Using MATLAB to randomly split an Excel sheet


I have an Excel sheet containing 1838 records, and I need to randomly split these records into 3 Excel sheets. I am trying to use MATLAB, but I am quite new to it and have only managed the following code:

    [xlsn, xlst, raw] = xlsread('data.xls');
    numrows = 1838;
    randindex = ceil(3*rand(numrows, 1));
    raw1 = raw(:,randindex==1);
    raw2 = raw(:,randindex==2);
    raw3 = raw(:,randindex==3);

Your general procedure will be to read the spreadsheet into some MATLAB variables, operate on those matrices so that you end up with three thirds, and then write each third back out.

So you've got the read covered with xlsread, which results in the two matrices xlsnum and xlstxt. I would suggest using the syntax

[~, ~, raw] = xlsread('data.xls'); 

The xlsread help file (you can access it by typing doc xlsread into the command window) says that the three output arguments hold the numeric cells, the text cells, and the whole lot. This is because a MATLAB matrix can only hold one type of value, whereas a spreadsheet is usually expected to have both text and numbers. The raw value holds all of the values in a 'cell array' instead, which is a different kind of MATLAB data type.
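To make the cell array idea concrete, here is a minimal sketch. The contents of raw below are made up for illustration and only stand in for what xlsread would return from a real sheet.

```matlab
% Hypothetical stand-in for the third output of xlsread: mixed text and numbers.
raw = {'name', 'score'; 'alice', 91; 'bob', 78};

class(raw)         % the container itself is a cell array
class(raw{2,1})    % this element holds text
class(raw{2,2})    % this element holds a number

% Parentheses return a smaller cell array; braces return the contents.
firstRecord = raw(2,:);    % 1x2 cell array (still a record)
firstScore  = raw{2,2};    % plain numeric value
```

For splitting records you want the parenthesis form, since it keeps each record as a row of cells that can be written straight back out.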

So by now you will have a cell array called raw. From here you want to do four things:

  1. work out how many rows you have (I assume each record is a row) using the size function and specifying the appropriate dimension (again, check the help file to see how to do this)
  2. create an index of random numbers between 1 and 3 inclusive, which you can use as a mask

    randindex = ceil(3*rand(numrows, 1));

  3. apply the mask to your cell array to extract the records matching each index

    raw1 = raw(randindex==1,:); % mask selects rows (records); same for the other 2 index values

  4. write each cell array to its own file

    xlswrite('output1.xls', raw1);
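Put together, the steps above can be sketched roughly as follows. This assumes one record per row and no header row. The raw data here is a made-up stand-in so the sketch runs without a spreadsheet, and the output file names are only examples.

```matlab
% Stand-in for: [~, ~, raw] = xlsread('data.xls');
% (hypothetical 12-record sheet with one numeric and one text column)
raw = [num2cell((1:12)'), repmat({'x'}, 12, 1)];

numrows   = size(raw, 1);                % step 1: number of records (rows)
randindex = ceil(3*rand(numrows, 1));    % step 2: random label 1..3 per record

parts = cell(1, 3);
for k = 1:3
    % step 3: the logical mask goes in the row position, all columns are kept
    parts{k} = raw(randindex == k, :);
    % step 4: uncomment to write each part to its own file
    % xlswrite(sprintf('output%d.xls', k), parts{k});
end
```

If your sheet does have a header row, you would probably want to peel off raw(1,:) before building the mask and put it back on top of each part before writing.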

You will probably have to fettle the arguments to get them to work the way you want, so be sure to check the doc functionname page to get the syntax right. Your main concern will be getting the indexing correct: MATLAB indexes row-first, whereas spreadsheets tend to be column-first (e.g. cell A2 is column A and row 2, but MATLAB matrix element M(1,2) is the first row and second column of matrix M, i.e. cell B1).

Update: splitting the file evenly is surprisingly more trouble. Because we're using random numbers for the index, it's not guaranteed to split evenly. Instead we can generate a vector of random floats, pick out the lowest 33% of them to make index 1, the highest 33% to make index 3, and let the rest be 2.
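A quick way to convince yourself of the row-first convention is to try it on a small matrix. This is only an illustrative sketch. The matrix and the mask are arbitrary and not taken from your data.

```matlab
M = magic(4);    % any 4x4 matrix will do

M(1,2)           % row 1, column 2: spreadsheet cell B1
M(2,1)           % row 2, column 1: spreadsheet cell A2

mask = [true; false; true; false];
rowsPicked = M(mask, :);    % mask in the first position picks rows 1 and 3
colsPicked = M(:, mask);    % mask in the second position picks columns 1 and 3
```

With 1838 records, a mask built from the row count belongs in the first position. Put in the second position it would be selecting columns, which most likely throws an index error since the sheet will not have that many columns.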

    randvec = rand(numrows, 1);          % floats between 0 and 1
    pct33 = prctile(randvec, 100/3);     % value of the 33rd percentile
    pct67 = prctile(randvec, 200/3);     % value of the 67th percentile
    randindex = ones(numrows, 1);
    randindex(randvec > pct33) = 2;
    randindex(randvec > pct67) = 3;

It still won't be absolutely even, since 1838 isn't a multiple of 3. You can see how many members each group has this way

numel(find(randindex==1)) 
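As an alternative sketch (a different technique from the percentile approach above, not a correction of it), you could shuffle the row numbers with randperm and deal the labels out in turn. This avoids prctile, which ships with the Statistics and Machine Learning Toolbox, and makes the group sizes differ by at most one. The variable names order and counts are mine.

```matlab
numrows = 1838;                    % record count from the question
order   = randperm(numrows);       % random ordering of the row numbers

% Deal labels 1,2,3,1,2,3,... to the rows in shuffled order.
randindex = zeros(numrows, 1);
randindex(order) = mod(0:numrows-1, 3) + 1;

% Count the members of all three groups at once.
counts = accumarray(randindex, 1)'    % [613 613 612] for 1838 records
```

The resulting randindex can be used as a row mask exactly as before, e.g. raw(randindex==1,:).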
