Author Topic: Speed on Linux again  (Read 13967 times)

greggman

  • Senior Community Member
  • Posts: 280
  • Hero Points: 14
Speed on Linux again
« on: June 01, 2011, 11:43:54 PM »
I'm sorry to start off this way, but I'm really frustrated. SlickEdit has become unusable for me lately, and support isn't answering, at least here on the forums.

I recently rebooted my machine. When I started SlickEdit, it said my <prog>.vtg file was corrupt. It took *35 minutes!!!!* before I could start using SlickEdit: 8-10 minutes until it finished reporting the <prog>.vtg file corrupt, and 25 minutes to retag, during which performance is so slow the editor is unusable.

Seriously guys, what am I supposed to do? I've been a loyal customer for 18 years, since v1.0, and I've got 18 years of SlickEdit muscle memory. I don't want to have to give up SlickEdit, but some word from you seems in order. Are you going to address this? Is it a priority? Will I see a fix in a reasonable amount of time, or should I be looking for other solutions?

chrisant

  • Senior Community Member
  • Posts: 1410
  • Hero Points: 131
Re: Speed on Linux again
« Reply #1 on: June 02, 2011, 02:10:14 AM »
35 minutes is painful; I feel your pain.

Using the forum search, I can see posts from you about speed issues dating back as far as 2006.  They were taken seriously and some improvements and/or hotfixes came out of them.  I've also posted about speed issues a number of times, and I've always felt my feedback was taken seriously and most often led to direct improvements with reasonably short turnaround time.

I only see one thread from you recently about speed, and it was less than a week ago, but maybe I missed some other posts in the forums.

I've always found that the more specific details I can provide, the better the SlickTeam is able to respond in a timely manner.  I haven't seen much detail in the recent speed reports, but I also recognize it's hard to capture "details" when we're talking half a million files (I work in some very large projects as well, anywhere from 200,000 files to 800,000 files).

For me on Windows (I know you're on Linux), SlickEdit 16 is faster than ever at tagging.  I recall some kind of problem that was discovered wrt Linux (I think?) during the 16 beta (I think?) that was slowing down performance in an unexpected way.  Maybe you can find it in the beta forum and check if it looks like it could be related.

Two ideas that might help in the short term:
- Turning off symbol coloring might alleviate some of the speed problems.
- Deleting the .vtg file might help speed up the tagging.

HTH

ScottW, VP of Dev

  • Senior Community Member
  • Posts: 1471
  • Hero Points: 64
Re: Speed on Linux again
« Reply #2 on: June 02, 2011, 02:24:48 PM »
I'm very sorry for your problem. We take all problem reports seriously, and performance issues are one of our top priorities. Much of what we did in v16 was to improve performance. Despite your experience, v16 is faster at tagging for most people. I don't know why it has been so bad for you.

Please remember that the forums are not an official support channel. See: http://community.slickedit.com/index.php?topic=28.0.
Quote
Although our staff will monitor the forums and answer selected questions, it is not our intent to address support requests from within the forum. Support requests should be submitted via www.slickedit.com/supportcase

The forums are for users to help users. We answer items when we can, particularly if they are helpful to other users. But all support issues should be routed through official support channels. I probably confuse things by posting to so many of these. If so, I apologize for creating this confusion.

Can you post a case number so I can look into this one?

Some questions (ignore any questions already covered in the support case):
1) It sounds like you're using background tagging. If so, turn it off. Tools > Options then Editing > Context Tagging. Set "Number of tagging threads" to 0 and the two "Use background thread to..." items to False. Of course, this means you won't be able to do anything until tagging has finished, but it sounds like that's where you are now. However, foreground tagging may avoid the problem that is causing this to be so slow, so it may shorten the time before you can start working. What were your tag times like using v15? Please let us know the difference in time to tag with background tagging on and off.
2) Are your source files stored locally?
3) Are your project and workspace files stored locally?
4) How many files in your workspace/projects? What language?
5) Was this a problem using v15? If not, can you use v15 while we are trying to get to the bottom of this? Are there specific changes in v16 that you need? Obviously the multithreading work wasn't a boon for you.
6) Can you post your Help > About SlickEdit info (redact anything you don't wish to share)?

That's all I can think of for now. We may not be able to fix this until the v16.0.1 release, which is planned for late July. But we'll do our best to get to the bottom of this and send you a fix if possible.

timur

  • Senior Community Member
  • Posts: 204
  • Hero Points: 3
Re: Speed on Linux again
« Reply #3 on: June 02, 2011, 02:51:17 PM »
I have found that the background tagging (which didn't exist in v11, the previous version I used) is very slow and bogs down the system.  I use the command-line tool now for all tagging, although it has its own drawbacks.

ScottW, VP of Dev

  • Senior Community Member
  • Posts: 1471
  • Hero Points: 64
Re: Speed on Linux again
« Reply #4 on: June 02, 2011, 06:07:41 PM »
Background tagging is easily disabled, as described previously. After that, it should be very similar to previous versions.

greggman

  • Senior Community Member
  • Posts: 280
  • Hero Points: 14
Re: Speed on Linux again
« Reply #5 on: June 03, 2011, 06:59:47 PM »
Thank you for responding.

Quote
I'm very sorry for your problem. We take all problem reports seriously, and performance issues are one of our top priorities. Much of what we did in v16 was to improve performance. Despite your experience, v16 is faster at tagging for most people. I don't know why it has been so bad for you.

Please remember that the forums are not an official support channel. See: http://community.slickedit.com/index.php?topic=28.0.
Quote
Although our staff will monitor the forums and answer selected questions, it is not our intent to address support requests from within the forum. Support requests should be submitted via www.slickedit.com/supportcase

The forums are for users to help users. We answer items when we can, particularly if they are helpful to other users. But all support issues should be routed through official support channels. I probably confuse things by posting to so many of these. If so, I apologize for creating this confusion.

Right. I just assume you'd be better off if solutions were public. Handling things through the support channel means you have to spend time with each person individually, even when the solution could be public.

Quote
Can you post a case number so I can look into this one?

I'll open a case if these don't solve the issue.

Quote
Some questions (ignore any questions already covered in the support case):
1) It sounds like you're using background tagging. If so, turn it off. Tools > Options then Editing > Context Tagging. Set "Number of tagging threads" to 0 and the two "Use background thread to..." items to False. Of course, this means you won't be able to do anything until tagging has finished, but it sounds like that's where you are now. However, foreground tagging may avoid the problem that is causing this to be so slow, so it may shorten the time before you can start working. What were your tag times like using v15? Please let us know the difference in time to tag with background tagging on and off.

I'll try that and tell you how it goes.

Quote
2) Are your source files stored locally?

Yes

Quote
3) Are your project and workspace files stored locally?

Yes

Quote
4) How many files in your workspace/projects? What language?

40k-50k files, mostly C++. I'm only adding .c, .cc, .cpp, .h, and .py files to my project. I've sent links to the source before, as the project is open source.

Quote
5) Was this a problem using v15? If not, can you use v15 while we are trying to get to the bottom of this? Are there specific changes in v16 that you need? Obviously the multithreading work wasn't a boon for you.

Speed has been a problem on all platforms for this project. Wait times of 7 minutes or more were common on Windows each time I synced (which rebuilds the .vcproj files in my project).

A big difference is a new workflow using git, a workflow that is gaining massive traction.

If you are not aware of how git works: git supports branching in a way that no other version control system does, so in git it is common to switch branches often (several times an hour).  When you switch branches, git changes a bunch of hardlinks on your files (similar to how OS X Time Machine works, if you are familiar with that). In other words, 'git checkout feature1' will nearly instantly switch all the files in your project to the versions you were at when you started working on feature1, and 'git checkout feature2' will switch them all to the state needed for feature2. It generally takes git 0-2 seconds, even for 50k files.

What this means for SlickEdit is that several thousand files in a project might change underneath it several times an hour.

Git usage is taking off. Tons of projects are moving to GitHub. Chrome is moving to git, WebKit is moving to git, and Processing is on git, to name a few.

The point I'm trying to make is that SlickEdit is going to have to work well with this new and increasingly popular workflow, and I'm guessing it wasn't really designed with that in mind. When I was using svn, p4, or cvs, files didn't change often. Generally, if I worked on multiple things at once, I had separate copies of the repo and switched that way. In that old-style workflow I'd switch projects to work on a different feature: c:\work\repo_feature1\project.vsj vs. c:\work\repo_feature2\project.vsj.  The files in each were relatively stable.  In git, though, I'm always in c:\work\repo. When I want to work on feature 1, I type 'git checkout feature1' and the files in c:\work\repo magically change to the state needed for feature 1. When I want to work on feature 2, I type 'git checkout feature2' and the files all magically change back to the state needed for feature 2.

That means SlickEdit now notices all these changed files and starts tagging. That 7-minute wait on Windows, which only happened each time I synced a particular repo, now happens several times an hour, and tagging being slower on Linux makes it even worse. I can switch tagging off, but I rely on tagging. Even if I make it not tag in the background, the git workflow means the long wait comes up really often.

If you can find the source of the poor tagging performance on Linux, and find out why the editor's response is so slow while it is tagging, that would probably be the biggest help.

Quote
6) Can you post your Help > About SlickEdit info (redact anything you don't wish to share)?

Code: [Select]
SlickEdit 2011 (v16.0.0.6 64-bit)

Serial number: xxxxxxx
Licensed number of users: Single user
License file: /usr/local/google/gman/slickedit/16.0.0.6/bin/slickedit.lic

Build Date: May 05, 2011
Emulation: Brief

OS: Linux
OS Version: Ubuntu 10.04.1 LTS
Kernel Level: 2.6.32-gg465-generic
Build Version: #gg465-Ubuntu SMP Mon Apr 11 05:52:28 PDT 2011
Processor Architecture: x86_64

X Server Vendor: The X.Org Foundation
Memory: 47% Load, 4880MB/10178MB Virtual
Shell Info: /usr/local/google/gman/slickedit/16.0.0.6/bin/secsh -i
Screen Size: 2240 x 1600, 2240 x 1600

Project Type: Gnuc
Language: .cc (C/C++)

Installation Directory: /usr/local/google/gman/slickedit/16.0.0.6/
Configuration Directory: /usr/local/google/gman/.slickedit/16.0.0/

Hotfixes:
/usr/local/google/gman/.slickedit/16.0.0/hotfixes/hotfix_se1600_1_cumulative.zip (Revision: 1)


Quote
That's all I can think of for now. We may not be able to fix this until the v16.0.1 release, which is planned for late July. But we'll do our best to get to the bottom of this and send you a fix if possible.


One thing I did notice: the configuration path (~/.slickedit) was on the network. I've moved it to be local. I'll see how much that helps.

Thank you. I'm sorry for being frustrated. I'm one of your biggest fans. I've gotten others here to use SlickEdit, and I'm someone who realizes the value in paying for something even when inferior but free alternatives exist.

ScottW, VP of Dev

  • Senior Community Member
  • Posts: 1471
  • Hero Points: 64
Re: Speed on Linux again
« Reply #6 on: June 03, 2011, 07:23:48 PM »
You are right to be frustrated. We have no tolerance for bad performance and you shouldn't either!

The git workflow poses a major challenge for us. Tagging only works if it knows where the symbols are defined. If the location of the file changes when you switch branches, I can't think how we could handle that except to retag. I'm amazed that git can update that many files in a couple seconds. I'm curious what's going on there. We'll have to look into this.

Having your config on a local drive is very important, too. Good catch! I didn't think to ask about that. I would think that would affect library tagging more than the speed of tagging your workspace, though. The tag file for the workspace is stored in the same directory as the workspace file.

Is this a wildcard project? That could be a problem. Maybe the wildcard lookup and the tagging are fighting for time. Also, you might try closing the Projects tool window. That has some performance problems associated with it.

We do like users to be able to see other answers, which is why I try to post to as many of these as I can. The problem is that the forum software has no tracking in it. I can't tell which items are open or resolved. I can't see which items have someone working on them. So, things are very likely to fall through the cracks on the forums as we get pulled into other activities. The Product Support team has real tracking software with workflows built in for license verification and case histories.

If you are on maintenance and support, please contact support and open a case for this. Reference this thread on the forums so they can see what we've already covered. If you aren't on maintenance and support, contact them anyway and tell them that I wanted them to handle this case. I'll tell them to watch for it. We need to get to the bottom of this performance issue.

chrisant

  • Senior Community Member
  • Posts: 1410
  • Hero Points: 131
Re: Speed on Linux again
« Reply #7 on: June 03, 2011, 07:50:11 PM »
@greggman, I recall that you uncovered some performance problems in Python tagging, specifically.  It might be interesting to try excluding the .py files temporarily, and see how that affects SE's tagging speed.  For example, if the problem is actually in the Python tagging parser and not in the outer loop(s) of SE's tagging engine, this could help focus SlickTeam's investigation.

greggman

  • Senior Community Member
  • Posts: 280
  • Hero Points: 14
Re: Speed on Linux again
« Reply #8 on: June 03, 2011, 08:37:50 PM »
Quote
The git workflow poses a major challenge for us. Tagging only works if it knows where the symbols are defined. If the location of the file changes when you switch branches, I can't think how we could handle that except to retag. I'm amazed that git can update that many files in a couple seconds. I'm curious what's going on there. We'll have to look into this.

AFAIK git works using hardlinks, which means that when it's switching versions all it's doing is changing links to files. No copies. In Unix lingo: 'ln repo_data/versions/file-version1.cc workspace/file.cc'.

If you want to learn git I suggest these 2 links

"The git parable" explains what's really happening in plain and simple terms. It's an easy read
http://tom.preston-werner.com/2009/05/19/the-git-parable.html


The ProGit book
http://progit.org/book/

The first chapter explains the basic differences between the old systems (p4, cvs, svn, etc.) and the new ones (hg, git, etc.).


Clark

  • SlickEdit Team Member
  • Senior Community Member
  • *
  • Posts: 6823
  • Hero Points: 526
Re: Speed on Linux again
« Reply #9 on: June 06, 2011, 01:18:08 PM »
Have you tried putting the SlickEdit workspace and tag file in the repository? That would hopefully allow the dates in the tag file to match those on disk when you do a checkout that switches all files.

HaveF

  • Junior Community Member
  • Posts: 3
  • Hero Points: 0
Re: Speed on Linux again
« Reply #10 on: June 09, 2011, 01:45:02 AM »
Quote
The git workflow poses a major challenge for us. Tagging only works if it knows where the symbols are defined. If the location of the file changes when you switch branches, I can't think how we could handle that except to retag. I'm amazed that git can update that many files in a couple seconds. I'm curious what's going on there. We'll have to look into this.

It's an interesting problem.
I have an idea: retag only the changed files instead of __all__ the files in a git-based project. That is:

1. SlickEdit should remember the current snapshot id (SHA-1 value), either via a git command or by parsing the relevant file:
$ cat .git/HEAD
ref: refs/heads/master
$ cat .git/refs/heads/master
cac0cab538b970a37ea1e769cbbde608743bc96d

2. After the user commits code or checks out another branch, SlickEdit should remember the newer snapshot id, such as:
1a410efbd13591db07496601ebc7a059dd55cfe9

3. Then SlickEdit should use a git command (or find out how this command works in git's source code) to find the changed files:
$ git diff --name-status cac0cab538b970a37ea1e769cbbde608743bc96d 1a410efbd13591db07496601ebc7a059dd55cfe9
M       .description
D       .gitattributes

4. Retag only the changed files (a sketch follows below).
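
A minimal sketch of steps 1-4 in Python, assuming SlickEdit could shell out to git; retag_file and drop_from_tag_db are hypothetical placeholders for whatever the tagging engine exposes:

Code: [Select]
import subprocess

def git_head(repo):
    # Steps 1 and 2: `git rev-parse HEAD` resolves .git/HEAD through any
    # ref to the current snapshot id (SHA-1), so we don't parse files by hand.
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo, text=True).strip()

def changed_files(repo, old_sha, new_sha):
    # Step 3: `git diff --name-status` lists "<status>\t<path>" pairs,
    # e.g. "M\t.description" or "D\t.gitattributes". (Rename entries carry
    # two paths and would need extra handling in a real implementation.)
    out = subprocess.check_output(
        ["git", "diff", "--name-status", old_sha, new_sha],
        cwd=repo, text=True)
    return [line.split("\t", 1) for line in out.splitlines() if "\t" in line]

# Step 4: retag only what changed.
# before = git_head("/work/repo")
# ... user runs `git checkout feature2` ...
# for status, path in changed_files("/work/repo", before, git_head("/work/repo")):
#     drop_from_tag_db(path) if status == "D" else retag_file(path)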

chrisant

  • Senior Community Member
  • Posts: 1410
  • Hero Points: 131
Re: Speed on Linux again
« Reply #11 on: June 09, 2011, 09:53:38 AM »
If git is touching files that it knows haven't actually changed, that would be a performance flaw in git that should be addressed in git itself (it would be such a silly oversight that it's hard for me to believe it exists).

But greggman said his project has 40k to 50k files, and "several thousand files in a project might change underneath it several times an hour".  Which suggests that git is only touching the files it knows changed.  If that's so, then it would only further hurt performance for SE to confirm with git that the changed files were changed.

The issue seems to be that it takes an uncomfortably long time to tag the thousands of files that really changed.  Clark's suggestion seems good:  It seems like having the tag db checked into git would allow git to automatically solve the very problem it created.

Alternatively, either SE could introduce revision history directly into the tag db (which seems complex and impractical, and has potential for unbounded growth), or SE could make tagging faster -- but that's not a scalable solution (if 2,000 files takes too long and performance is improved by 10x, it just means 20,000 files takes too long).  So having the tag db in git seems like a better solution to the git problem.

atrens

  • Community Member
  • Posts: 6
  • Hero Points: 0
Re: Speed on Linux again
« Reply #12 on: June 16, 2011, 03:48:35 PM »
Quote
If git is touching files that it knows haven't actually changed, that would be a performance flaw in git that should be addressed in git itself (it would be such a silly oversight that it's hard for me to believe it exists).

But greggman said his project has 40k to 50k files, and "several thousand files in a project might change underneath it several times an hour".  Which suggests that git is only touching the files it knows changed.  If that's so, then it would only further hurt performance for SE to confirm with git that the changed files were changed.

The issue seems to be that it takes an uncomfortably long time to tag the thousands of files that really changed.  Clark's suggestion seems good:  It seems like having the tag db checked into git would allow git to automatically solve the very problem it created.

Alternatively, either SE could introduce revision history directly into the tag db (which seems complex and impractical, and has potential for unbounded growth), or SE could make tagging faster -- but that's not a scalable solution (if 2,000 files takes too long and performance is improved by 10x, it just means 20,000 files takes too long).  So having the tag db in git seems like a better solution to the git problem.

Not sure if having a huge tag file checked into git would actually help. I think it would just slow down git.  Also, SlickEdit would need to worry about the file changing underneath it, which I think would be a hard problem to solve.


In my case I normally deal with around 65k files, and my tag file is over 400MB -

-rw-r--r--  1 atrens  wheel  463478784 Jun 16 10:33 trunk.git.vtg

That's only about 1% of the sandbox size, but it's still getting a bit unwieldy...

Not sure if splitting up the tag file helps either, unless you could predict which files were most likely to change and group them together.

I think the only tractable option would be for SlickEdit to get a list of files changed by the last git operation - perhaps git could dump this someplace convenient as it performs the operation, and then SlickEdit could re-tag only those files. Even better, if a SlickEdit thread were fed the file names as git finished processing them, it could follow along and take advantage of the files still being in the buffer cache.

I guess this means changes to git, but pretty simple changes, I'd think. For robustness, git could even be taught to emit XML-ized output when passed a special flag.
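
For what it's worth, git's existing post-checkout hook is already handed the old and new HEAD on every checkout, so something close to this may be possible without changing git at all. A minimal sketch, assuming a Python hook installed as .git/hooks/post-checkout (the output file name is arbitrary, and the editor would still have to notice and read it):

Code: [Select]
#!/usr/bin/env python
# git invokes post-checkout with: <old HEAD> <new HEAD> <branch-switch flag>,
# with the top of the working tree as the current directory.
import subprocess
import sys

old_head, new_head, branch_switch = sys.argv[1:4]
if branch_switch == "1" and old_head != new_head:
    changed = subprocess.check_output(
        ["git", "diff", "--name-only", old_head, new_head], text=True)
    # Dump the changed-file list somewhere the editor can pick it up.
    with open(".git/last-checkout-changes.txt", "w") as f:
        f.write(changed)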

--Andrew

chrisant

  • Senior Community Member
  • Posts: 1410
  • Hero Points: 131
Re: Speed on Linux again
« Reply #13 on: June 16, 2011, 05:37:01 PM »
Quote
Not sure if having a huge tag file checked into git would actually help. I think it would just slow down git.  Also, SlickEdit would need to worry about the file changing underneath it, which I think would be a hard problem to solve.
SE locks the tag file when in use; git wouldn't be able to replace the tag file while it's in use, so that shouldn't be a problem.

Yes, checking in the tag file would make submission slower in git, but only when the tag file is submitted (not during sync/repoint, since apparently git doesn't transfer content, it merely changes hard links).  It's a pragmatic trade-off:  take a perf hit during sync, or take a perf hit after sync -- which one is less impactful?  It sounds like there's little control over the frequency of sync, but the frequency of submission of the tag file can be fully controlled by the user, for example 2x per week, so the perf hit after sync is never more than half a week's worth of changes.

Quote
I think the only tractable option would be for SlickEdit to get a list of files changed by the last git operation - perhaps git could dump this someplace convenient as it performs the operation, and then SlickEdit could re-tag only those files. Even better, if a SlickEdit thread were fed the file names as git finished processing them, it could follow along and take advantage of the files still being in the buffer cache.
That assumes that the perf cost is coming from detecting the changed files.  Greggman already stated that only the changed files are being retagged, so that could avoid doing a dir scan of 40k files, but the tagging cost is the same.  Intuitively I'd expect the cost of retagging 3000 files to be much higher than the cost of doing a directory scan over 40,000 files, but I could be wrong... (especially since I use solid state drives these days so I don't experience seek time delays or degradation from disk cache misses.  ;))

atrens

  • Community Member
  • Posts: 6
  • Hero Points: 0
Re: Speed on Linux again
« Reply #14 on: June 16, 2011, 06:35:15 PM »
Quote
Not sure if having a huge tag file checked into git would actually help. I think it would just slow down git.  Also, SlickEdit would need to worry about the file changing underneath it, which I think would be a hard problem to solve.
SE locks the tag file when in use; git wouldn't be able to replace the tag file while it's in use, so that shouldn't be a problem.

Not a problem for SE; it would be a problem for git, however. :)

Quote
Yes, checking in the tag file would make submission slower in git, but only when the tag file is submitted (not during sync/repoint, since apparently git doesn't transfer content, it merely changes hard links).  It's a pragmatic trade-off:  take a perf hit during sync, or take a perf hit after sync -- which one is less impactful?  It sounds like there's little control over the frequency of sync, but the frequency of submission of the tag file can be fully controlled by the user, for example 2x per week, so the perf hit after sync is never more than half a week's worth of changes.

I suppose the pain would be, as you say, on check-in - I'd need to try it out to see how much pain. :) One thing to note: you can't do a re-point with changed files not checked in, so you'd almost always need to check in your tag file before doing a re-point.

Quote
I think the only tractable option would be for SlickEdit to get a list of files changed by the last git operation - perhaps git could dump this someplace convenient as it performs the operation, and then SlickEdit could re-tag only those files. Even better, if a SlickEdit thread were fed the file names as git finished processing them, it could follow along and take advantage of the files still being in the buffer cache.
That assumes that the perf cost is coming from detecting the changed files.  Greggman already stated that only the changed files are being retagged, so that could avoid doing a dir scan of 40k files, but the tagging cost is the same.  Intuitively I'd expect the cost of retagging 3000 files to be much higher than the cost of doing a directory scan over 40,000 files, but I could be wrong... (especially since I use solid state drives these days so I don't experience seek time delays or degradation from disk cache misses.  ;))

I think that if the file is already in the buffer cache as a result of a git operation on it, a quick second scan of it shouldn't hit the disk at all. Also, intuitively, I think that if you're walking through all 40k+ files looking for changed ones, you're re-doing a lot of work that git just did.

Also to consider: depending on the git operation, you're not guaranteed that the files you're pulling in haven't changed since you last looked at them, and so they would need to be re-indexed anyway. Of course, git would know this - it generates a unique hash per file based on its contents. If you tracked those hashes, you could be even smarter and use them to whittle down the list that needs re-indexing.
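
A minimal sketch of that hash-tracking idea, assuming the editor records each file's blob hash at its last tagging pass; load_recorded_hashes and retag are hypothetical placeholders:

Code: [Select]
import subprocess

def blob_hashes(repo, rev="HEAD"):
    # `git ls-tree -r` prints "<mode> <type> <sha1>\t<path>" for every
    # tracked file; the sha1 is git's content-based hash for the file.
    out = subprocess.check_output(
        ["git", "ls-tree", "-r", rev], cwd=repo, text=True)
    hashes = {}
    for line in out.splitlines():
        meta, path = line.split("\t", 1)
        hashes[path] = meta.split()[2]
    return hashes

# Re-index only files whose content hash differs from the recorded one.
# old = load_recorded_hashes()          # hypothetical tag-db helper
# new = blob_hashes("/work/repo")
# stale = [p for p, h in new.items() if old.get(p) != h]
# retag(stale)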

--Andrew