Author Topic: Slick C Performance challenge  (Read 5310 times)

Brandon

  • Community Member
  • Posts: 37
  • Hero Points: 5
Slick C Performance challenge
« on: November 09, 2006, 04:20:13 PM »
I have a performance issue in a list_buffers replacement that I use where I mark the buffers with a P to indicate they are part of the currently active project.

Let's say the list of open buffers is list A, and the list of project files is list B. So, during dialog instantiation, I need to identify which items in list A are in the list B group.

I currently use a mechanism where I process list A line-by-line and do a regex search against list B.  This works fine, except when the lists get large.  My current project has over 1000 files in it, and I may have 200-300 buffers open at once.  This makes the time to show the dialog a little sluggish.

I would like to hear ideas about how to speed this up.  I had thought about encoding the project files/paths into a numeric digest (a la MD5) and doing a numeric lookup in an array instead of the regex.  It may speed up the search but the time to encode everything may nullify any performance gain.


Graeme

  • Senior Community Member
  • Posts: 2796
  • Hero Points: 347
Re: Slick C Performance challenge
« Reply #1 on: November 10, 2006, 06:26:48 AM »

You could try a Slick C hash-table for the project file list instead of your own MD5 thing

int some_list :[];

void test()
{
    some_list:['a string'] = 1;
    if ( some_list._indexin('a string') )
        message('yes! a string is in the list');
}

_nextel can be used to traverse a hash table.

Coincidentally I've been working on almost the same problem recently but haven't tried the hash-table yet.  You might get even better performance/ memory usage by stripping off the pathnames and making a hash-table list of pathnames, then in the project list, replace the path with the pathname index as a string.  Some hashing functions are more suitable than others and as explained in Wikipedia, even Donald Knuth's hashing function exhibits "particularly poor clustering".  I guess whatever hashing function Slick has, for 200 or 300 lookups it'll be heaps fast enough for the lookup but maybe not fast enough on the table generation or use too much memory.  Wikipedia has a fast hashing function that they say is also excellent in other respects.  I'm guessing the slick one is fairly good though, coz they're performance oriented.

http://en.wikipedia.org/wiki/Hash_table

Graeme