Author Topic: Tagging, VTGs and Tagging Cache (Read 8836 times)

MartyL · « **on:** August 28, 2007, 05:08:25 PM »

I have to say, I really am impressed with how effectively the tagging engine works. I've been running some stress tests to see just what VSE and the tagging engine are capable of, but I do have a couple of questions about what it is actually doing.

My goal in these efforts is to troubleshoot a problem in a project. The project has around 2300 source files and takes up to three hours to complete the tagging process. I'm in the dark about what is contained in those files, but I imagine it can't be worse than what I'm putting the tagging engine through. I just completed a run of 10,000 files with about 10 elements (to be tagged) in each file. It took less than a minute. Now I'm in the middle of running 100,000 files with 20 elements each. That is 16,000,000 lines of source code. Admittedly, it is taking a mite longer, but it's still going to finish in under 20 minutes (at its current rate).
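For anyone who wants to reproduce a run like this, here is a minimal sketch of a generator for synthetic source files. The thread does not say which language the test files were written in, so the C-like function bodies, the `stress_NNNNNN.c` file names, and the `generate_stress_files` helper are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def generate_stress_files(root, n_files, elements_per_file, body_lines=5):
    """Write n_files synthetic source files, each holding elements_per_file functions.

    Hypothetical helper: the C-like output is an assumption, not the format
    used in the original stress test. Returns the total number of functions.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    for i in range(n_files):
        chunks = []
        for j in range(elements_per_file):
            # Unique names so every element is a distinct symbol for the tagger.
            body = "\n".join(f"    x += {k};" for k in range(body_lines))
            chunks.append(
                f"int file{i}_func{j}(int x)\n{{\n{body}\n    return x;\n}}\n"
            )
        (root / f"stress_{i:06d}.c").write_text("\n".join(chunks))
    return n_files * elements_per_file


if __name__ == "__main__":
    out_dir = Path(tempfile.mkdtemp(prefix="tagging_stress_"))
    total = generate_stress_files(out_dir, n_files=100, elements_per_file=10)
    print(f"wrote {total} functions under {out_dir}")
```

Raising `n_files`, `elements_per_file`, and `body_lines` scales the run up; the generated directory can then be added to a project and tagged as usual.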

What I'm interested in knowing is what exactly the Tagging Cache does. It was my theory that a larger cache size would vastly increase performance on largish tagging runs, but I've been running the stress tests on a cache size of 512KB (rather than the default 32MB) with little to no noticeable difference. I have not tried the 100,000 files on the larger cache, but I don't expect to see a large change in time. What is the purpose of the Tagging Cache? Is it merely a paging file for when the tagging engine exceeds available memory?

Also, is there any way to display the contents of a VTG file other than the listvtg tool? If not, does the listvtg tool accept any parameters for output to a file?

MartyL · « **Reply #1 on:** August 28, 2007, 05:28:12 PM »

Answered my own question about the output parameter. Totally slipped my mind that the command prompt will do it for me.
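At the command prompt this is just the `>` redirection operator. The same thing can be scripted; below is a minimal sketch in Python. The thread does not give the listvtg command line, so the example runs a stand-in command, and the `dump_command_output` helper and the `listing.txt` file name are hypothetical.

```python
import subprocess
import sys


def dump_command_output(cmd, out_path):
    """Run cmd and write its standard output to out_path; return the exit code."""
    with open(out_path, "w", encoding="utf-8") as out:
        result = subprocess.run(cmd, stdout=out, check=False)
    return result.returncode


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. Replace it with the real
    # listvtg invocation; its arguments are not documented in this thread.
    stand_in = [sys.executable, "-c", "print('tag listing')"]
    code = dump_command_output(stand_in, "listing.txt")
    print(f"exit code {code}, output saved to listing.txt")
```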

StephenW · « **Reply #2 on:** August 28, 2007, 09:21:09 PM »

Tagging, like most things in SlickEdit, is not comprehensively documented. I guess the programmers already have too much to do just creating all the nice features for us to be able to do that too. However, a fair amount of information can be gleaned from the macro files. Take a look at tags.e, especially the comments at the top, and tagsdb.sh, which has all the prototypes for accessing the tags database and gives some hints as to how the database works. But as far as I can tell, there is no information on how the database works internally that would let you work out the bottlenecks. Is it a home-grown database engine, or some commercial one? What sort of indexes does it use? Is there a "big brother" version of the engine that could cope better with huge projects? And are the bottlenecks in the database engine, or before things even get there?

From some of the forum posts, it would appear that some people with huge projects do their tagging as overnight automatic processes, due to the sort of performance problems you are seeing.

As to the cache, the feeling I have always had about it was that it was used for read accesses only, with writes going directly to the database. But that is only a guess.

MartyL · « **Reply #3 on:** September 07, 2007, 06:12:28 PM »

The problem had to do with how my language handled includes. A similar project, in most languages, will not generate references from include files.
