Alan Weiss <aweiss@mathworks.com> wrote in message <nafb6l$mur$1@newscl01ah.mathworks.com>...
> On 2/22/2016 10:42 AM, someone wrote:
> > "Alessandro De Sanctis" wrote in message
> > <naf92r$if5$1@newscl01ah.mathworks.com>...
> >> Hello,
> >>
> >> I have to maximize a likelihood in which to every observation
> >> correspond a specific (non-integer) weight. In particular, I am
> >> referring to sampling weights, which denote the inverse of the
> >> probability that the observation is included in the sample.
> >>
> >> I tried by expanding the dataset (so that an observation with weight =
> >> 100 is repeated 100 times) but the dataset became extremely large and
> >> it's the second week that fminsearch is running.
> >>
> >> My ultimate goal would be to estimate a non-linear model with a binary
> >> dependent variable and weights to observations.
> >>
> >> Please any alternative idea on how to proceed is welcome. Thank you in
> >> advance.
> >> Alessandro
> >
> > To help us help you, can you show us a small snippet of your code? The
> > above description is pretty vague and doesn't give us much to go on.
>
> In particular, what is the mathematical form of your objective function,
> meaning the function you are trying to minimize? There is probably a
> shortcut that you can take in your function definition to account for
> weights, rather than adding new rows to the dataset.
>
> Also, fminsearch is not the fastest or most robust optimizer in
> Optimization Toolbox. You might do better to try fminunc, or another
> appropriate solver.
>
> Alan Weiss
> MATLAB mathematical toolbox documentation
Thanks, I'm now using fminunc. Moreover, I've just found a way to deal with adding rows that should save a lot of time. I will now run this version of my program.
Let me try to be clearer. I am working on a dataset of N = 60,000 observations. My theoretical model has the form
y = b0 + b1 * A1(lambda_1,data) + b2 * A2(lambda_2,data) + controls * b + error
where y is a binary variable, A1 and A2 are functions of two parameters and data, and controls is a matrix (60,000 x 95) of regressors.
------------------------------------------------------------------------------------------------------
%%% 1) I load data and starting values (start_vals), and expand the dataset by rounding weights to the nearest integer. The following code is new and I haven't tried it on the whole dataset yet; it will probably take hours. The expanded dataset will have dimension N = 134,985,980.
weights = round(data(:,end));
DATA = [];
for i = 1:length(weights)
    DATAi = data(i,:);
    DATAi = repmat(DATAi,weights(i),1);
    DATA = [DATA; DATAi];
end
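As an aside, growing DATA inside the loop reallocates it on every iteration, which is what makes this so slow. An untested vectorized sketch (repelem requires R2015a or later):

```matlab
% Build a row index that repeats index i exactly weights(i) times,
% then expand the dataset with a single indexing operation.
weights = round(data(:,end));
idx = repelem((1:size(data,1))', weights); % e.g. weights = [2;1] -> idx = [1;1;2]
DATA = data(idx,:);
```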
%%% 2) I run the optimization:
[MLE, loglike_val] = fminunc(@(parameters) loglike2_complete(parameters,DATA),start_vals)
%%% where loglike2_complete works as follows:
function L = loglike2_complete(param,DATA)
%%% 3.a) I compute A1 and A2 (following a theoretical model where A1 and A2 are weighted sums -- with their own weights -- of elements in the matrices R and C):
lambda1 = param(1);     % unpack the parameter vector
lambda2 = param(2);
beta = param(3:end);    % coefficients on [intercept A1 A2 controls]
% note: controls, R, C and Y are used below but never defined in this
% function -- they must be extracted from DATA (or passed in) here
N = size(DATA,1); % number of rows; length() would return the largest dimension
A1 = zeros(N,1);
A2 = zeros(N,1);
age = controls(:,10);
for i = 1:N % elements of A1 and A2
    % A1(lambda1)
    agei = repmat(age(i),age(i)-1,1); % column vector filled with age(i)
    k = (1:age(i)-1)';                % integers from 1 to age(i)-1
    num = (agei-k).^lambda1;
    den = sum((agei-k).^lambda1);
    w = num ./ den;                   % weights sum to 1
    Ri = R(i,~isnan(R(i,:)));  % keep only non-missing values of R for this id
    A1(i) = Ri * w;            % requires Ri to have exactly age(i)-1 entries
    % A2(lambda2)
    num = (agei-k).^lambda2;
    den = sum((agei-k).^lambda2);
    w = num ./ den;
    Ci = C(i,~isnan(R(i,:)));  % note: this reuses R's missing-value mask for C
    A2(i) = Ci * w;
end
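Since the weight vector w depends only on age(i) and the lambdas, one untested refinement is to compute each weight vector once per distinct age value instead of once per row:

```matlab
% Sketch: precompute the age-dependent weight vectors once per distinct age,
% then reuse them for every row sharing that age.
for a = unique(age)'
    rows = find(age == a)';
    k = (1:a-1)';
    w1 = (a-k).^lambda1; w1 = w1/sum(w1); % weights for A1
    w2 = (a-k).^lambda2; w2 = w2/sum(w2); % weights for A2
    for i = rows
        mask = ~isnan(R(i,:));            % same mask as the original code
        A1(i) = R(i,mask) * w1;
        A2(i) = C(i,mask) * w2;
    end
end
```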
%%% 3.b) I write the model in the form y = X * beta, where beta is start_vals excluding lambda1 and lambda2:
X = [ones(N,1) A1 A2 controls];
%%% 3.c) I compute the negative log-likelihood I want to minimize:
L = -(sum(Y.*log(normcdf(X*beta,0,1))) + sum((1-Y).*log(1-normcdf(X*beta,0,1))));
------------------------------------------------------------------------------------------------------
I think there are faster ways to expand the dataset and to compute A1 and A2. But what I'd really like to know is how to handle those weights without expanding the dataset at all (also because rounding the weights loses information).
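One possibility I'm considering (untested sketch, with w the original non-integer sampling weights aligned with the 60,000 rows) is to weight each observation's log-likelihood contribution directly, which removes the need to expand the dataset and handles non-integer weights exactly:

```matlab
% Weighted negative log-likelihood: each observation's contribution is
% multiplied by its sampling weight instead of duplicating its row.
w = data(:,end);            % non-integer sampling weights, no rounding needed
p = normcdf(X*beta,0,1);    % probit probabilities
L = -sum( w .* ( Y.*log(p) + (1-Y).*log(1-p) ) );
```

With integer weights this reproduces the expanded-dataset likelihood exactly, since repeating a row m times just multiplies its log-likelihood term by m.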
P.S. I've not seen the output of this program yet.