Channel: MATLAB Central Newsreader - tag:"table"

Re: efficiency using categorical data

On 04/12/2017 1:21 PM, Bruce Elliott wrote:
> "Yair Altman" wrote in message <ocli7j$k5c$1@newscl01ah.mathworks.com>...
>> Categorical data and tables in general use more memory and are less
>> performant than the corresponding implementation using simple arrays.
...

> Why would categorical data use more memory? In my example, I could
> repeat the char array containing the full path and filename for every
> row of data that came from that file, but I don't see how that could
> take less memory than simply storing an index to a list of filenames.
> In my case, converting five columns to categoricals reduced the size of
> the table from 95.4 MB to 8.8 MB.

What Yair is alluding to is that a table, cell array, structure, or
other higher-level abstraction uses memory beyond the data itself in
order to provide the amenities of that abstract data type. Here's a
simple demonstration:

 >> ans=int8(1:5);
 >> whos ans
   Name      Size      Bytes  Class          Attributes
   ans       1x5           5  int8
 >> ans=int32(1:5);
 >> whos ans
   Name      Size      Bytes  Class          Attributes
   ans       1x5          20  int32
 >> ans=1:5;
 >> whos ans
   Name      Size      Bytes  Class          Attributes
   ans       1x5          40  double
 >> ans=categorical(1:5);
 >> whos ans
   Name      Size      Bytes  Class          Attributes
   ans       1x5         378  categorical
 >>

So, while turning long strings or other memory-intensive data into a
categorical can save memory, a categorical is not the same internally
as a plain integer array. Here the categorical array took 338 bytes
more than the double array (378 vs. 40), even though internally it only
needs a short integer per element.
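
For the opposite case, where categorical does save memory, here is a
minimal sketch along the lines of Bruce's example. The file name and
row count are made up for illustration, and the byte counts will vary
with the MATLAB release, so none are quoted here.

```matlab
% Hypothetical data: one long file name repeated for every row.
names = repmat({'/data/run_001/measurements.csv'}, 10000, 1);

% Same information stored as a categorical (one category, 10000 codes).
cats = categorical(names);

% whos returns a struct whose 'bytes' field holds the size in memory.
a = whos('names');
b = whos('cats');
fprintf('cellstr:     %d bytes\n', a.bytes);
fprintf('categorical: %d bytes\n', b.bytes);
```

With many rows and few distinct strings, the fixed overhead of the
categorical is small compared with the cost of repeating the text.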

Experimenting with N = 1 to 5 elements, one finds that the size in
bytes is

mCat=64*N+58

for integer values with no additional properties stored (N=5 gives 378,
matching the whos output above). So there is a price for the
convenience. If the data you are replacing takes less memory than that,
converting to categorical will actually increase memory usage, not
decrease it.
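
The experiment behind that formula can be repeated with a short loop.
This is only a sketch: it assumes the same categorical(1:N) input as
the demonstration above, and the numbers it prints depend on the MATLAB
release, so they may not match 64*N+58 on your system.

```matlab
% Measure the size of categorical(1:N) for N = 1..5.
nMax = 5;
bytes = zeros(1, nMax);
for N = 1:nMax
    c = categorical(1:N);   % N elements, each its own category
    s = whos('c');          % struct with a 'bytes' field
    bytes(N) = s.bytes;
    fprintf('N = %d: %d bytes\n', N, bytes(N));
end

% Increment per added element; a constant value indicates linear growth.
disp(diff(bytes));
```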

--
