Topic: Tagging not smart enough

YordanPavlov

  • Junior Community Member
  • Posts: 2
  • Hero Points: 0
Tagging not smart enough
« on: November 18, 2011, 01:10:21 pm »
I am using SlickEdit Version 15.0.1.3 on Debian Linux.
The problem I face is with the "Go to Definition" and "Go to References" features. They seem to be nothing more than a 'grep' through all the tag files, with no additional intelligence involved. If I have a very common class member name, for example, I get results for members with the same name throughout the workspace, even though most of those results are obviously irrelevant. The same is true for function names, method names and so on. For example, with a method named "Init()", the results of "Go to References" include every method throughout my workspace that shares this name.

It is my belief that an intelligent references system is pretty much the essence of a modern IDE and I do not think this is acceptable.

ehab

  • Senior Community Member
  • Posts: 285
  • Hero Points: 15
  • coding with SE is like playing music
Re: Tagging not smart enough
« Reply #1 on: November 18, 2011, 02:20:25 pm »
Can you please provide an example source? I would also like to check it, since I am seeing some odd behavior in tag lookup.

JeffB

  • Senior Community Member
  • Posts: 284
  • Hero Points: 13
Re: Tagging not smart enough
« Reply #2 on: November 18, 2011, 04:47:25 pm »
I've wondered about this often, but never asked the question.  When I choose "Go to References", SE asks me exactly which variable (or whatever) I want references to, but then returns references to any variable (or whatever) with the same name.  For instance, in C, suppose I have two structures that each have a member named "my_name".  If I select "my_name" in one of the structure definitions and choose "Go to Reference of my_name", SE prompts me for which "my_name" I want references to (which could be inferred in this case, since my cursor is on the "my_name" in a particular structure definition).  After I choose the definition in the list, SE proceeds to show any and every use of "my_name", no matter how it's used or which structure it comes from.  It seems that in many cases the context could be determined, and only the appropriate "my_name" uses displayed in the references list.

I think this is the same behavior described in this thread.

Jeff

YordanPavlov

  • Junior Community Member
  • Posts: 2
  • Hero Points: 0
Re: Tagging not smart enough
« Reply #3 on: November 19, 2011, 04:53:50 pm »
@JeffB
Yes, that pretty much sums it up. Although sometimes I am not even asked which symbol I want referenced.

@ehab
Of course I can provide source. Still that implies that it is working correctly for you, right?

jwiede

  • Community Member
  • Posts: 98
  • Hero Points: 12
Re: Tagging not smart enough
« Reply #4 on: November 21, 2011, 10:43:07 pm »
That it asks for the specific instance, and then ignores what the user indicates and provides references to all instances is "less acceptable" than just providing all instances without asking.  SE should be capable of constraining "show references" by either user-selected context/namespace, or through inference from the selected instance. 

At the very least, if it's going to ignore such constraints anyway, then it shouldn't be asking the user to specify which instance matters.

jimbo333

  • Community Member
  • Posts: 58
  • Hero Points: 3
Re: Tagging not smart enough
« Reply #5 on: November 22, 2011, 05:37:00 pm »
Just a thought, but this might go back to legacy C-style symbols (pre-C89?), where all structure members were in a single scope.  Newer compilers put structure members in a separate scope for each structure.  I wonder if there is a mode setting somewhere in SlickEdit to change this behavior.

Tree

  • Community Member
  • Posts: 79
  • Hero Points: 2
Re: Tagging not smart enough
« Reply #6 on: January 10, 2012, 09:26:53 pm »
This is not limited to C. The behavior is the same, and if anything more annoying, in Java.

RaffoPazzo

  • Community Member
  • Posts: 61
  • Hero Points: 2
Re: Tagging not smart enough
« Reply #7 on: January 12, 2012, 08:47:36 am »
I had the same problem, and I solved it by using the "Defs" toolbox: right-click the symbol you want to look up references for and select "References" from the pop-up menu.
It works perfectly for me.

JeffB

  • Senior Community Member
  • Posts: 284
  • Hero Points: 13
Re: Tagging not smart enough
« Reply #8 on: January 12, 2012, 03:34:58 pm »
This didn't make a difference for me...seems like this wouldn't be any different than choosing references from an edit window or typing it in manually anyway.

Dennis

  • SlickEdit Team Member
  • Senior Community Member
  • Posts: 2603
  • Hero Points: 396
Re: Tagging not smart enough
« Reply #9 on: January 23, 2012, 04:01:36 pm »
Symbol references work by using an inverted file index to compute the set of files that a symbol name appears in.  The search then uses symbol navigation to refine each place the symbol appears and determine whether it is a match to the symbol you selected.  If it is not a symbol navigation match, it is removed from the list of references.  It does this using the same logic that symbol navigation uses to jump to a symbol.  This logic involves evaluating the expression prefix, looking up the type of the symbol under the cursor, and determining the exact class or struct in play (to over-simplify the process).  In the cases where we can't compute the exact origin of a symbol, it falls back on logic that just matches the symbol by name.  References will also fall back on that logic when refining the list of symbol instances that match.
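
To make the two phases described above concrete, here is a minimal toy model in Java. It is only a sketch under stated assumptions: the class ReferenceSearchSketch, the Occurrence type, the file names, and the "?" prefix for unresolved matches are all hypothetical names invented for illustration, and none of this is SlickEdit's actual implementation or API. Phase 1 uses an inverted index to narrow the search to files containing the name; phase 2 keeps occurrences whose owner resolves to the selected class, drops those that resolve elsewhere, and falls back to a name-only match when the owner cannot be computed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a two-phase reference search. All names here are
// made up for illustration; this is not SlickEdit's implementation or API.
final class ReferenceSearchSketch {

    /** One textual occurrence of a symbol name in a source file. */
    static final class Occurrence {
        final String file;
        final int line;
        final String name;
        final String resolvedOwner; // null when the owning class could not be computed

        Occurrence(String file, int line, String name, String resolvedOwner) {
            this.file = file;
            this.line = line;
            this.name = name;
            this.resolvedOwner = resolvedOwner;
        }
    }

    /** Phase 1: inverted index mapping each symbol name to the files it appears in. */
    static Map<String, Set<String>> buildIndex(List<Occurrence> occurrences) {
        Map<String, Set<String>> index = new HashMap<>();
        for (Occurrence o : occurrences) {
            index.computeIfAbsent(o.name, k -> new TreeSet<>()).add(o.file);
        }
        return index;
    }

    /** Phase 2: keep exact matches, drop proven mismatches, flag unresolved ones with "?". */
    static List<String> findReferences(String owner, String name, List<Occurrence> occurrences) {
        Set<String> candidateFiles =
                buildIndex(occurrences).getOrDefault(name, Collections.<String>emptySet());
        List<String> results = new ArrayList<>();
        for (Occurrence o : occurrences) {
            if (!candidateFiles.contains(o.file) || !o.name.equals(name)) {
                continue; // not an occurrence of the requested name
            }
            if (o.resolvedOwner == null) {
                results.add("? " + o.file + ":" + o.line); // name-only fallback
            } else if (o.resolvedOwner.equals(owner)) {
                results.add(o.file + ":" + o.line); // exact match
            }
            // Occurrences that resolve to a different owner are dropped.
        }
        return results;
    }

    /** Small fixed data set: two classes with an init method, one unresolved call. */
    static List<String> demo() {
        List<Occurrence> occurrences = new ArrayList<>();
        occurrences.add(new Occurrence("Parser.java", 10, "init", "Parser"));
        occurrences.add(new Occurrence("Lexer.java", 22, "init", "Lexer"));
        occurrences.add(new Occurrence("Main.java", 5, "init", null));
        occurrences.add(new Occurrence("Main.java", 8, "run", "Parser"));
        return findReferences("Parser", "init", occurrences);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this model the unresolved occurrence in Main.java survives as a "might be" match, which is the kind of result that accumulates when the type information needed for phase 2 is missing.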

As with anything, the results you get depend on the quality of the input you put in.  If your source code is replete with preprocessing, side-effecting header files, and missing or duplicated header files, then you have to expect it to be difficult to parse.

Instead of making assumptions that the symbol navigation system doesn't work, use the fact that SlickEdit prompts you for which symbol to navigate to as an indication that your code is not being parsed correctly, and figure out what is non-standard about your code that trips up the parser.

95% of the time in C and C++ it's going to be preprocessing.  Go to Document > C/C++ Options > C/C++ Preprocessing... to configure things to work around the problem.  SlickEdit does not do full preprocessing of your source code (meaning, it does not follow #include files).  We do it this way for performance, but also for simplicity.  It's not as if following includes would solve everything, because you would still need to provide your include path, including compiler include paths and preprocessor defines, and the tagging would have to guess or skip includes and #if paths that are underspecified anyway.

95% of the time in Java, if things don't parse correctly, it's because you don't have everything you need tagged.  You probably are using jar files, but you have neither the jar files nor the source for the jar files tagged.

Hope this information helps.  As usual, if all else fails, post an example.
« Last Edit: January 23, 2012, 04:04:02 pm by Dennis »

Tree

  • Community Member
  • Posts: 79
  • Hero Points: 2
Re: Tagging not smart enough
« Reply #10 on: January 23, 2012, 04:24:59 pm »
Pedantically simple java example: Go to any class that has a toString() method (which should be most of them), and search for references to that method.
This takes *minutes* on my 8-core Xeon, and produces garbage results. It spends tons of time evaluating tens of thousands of files that may call a method called "toString" but *do not* have any reference to the class I searched from. The same is true (to a lesser but still annoying extent) for any common method name.

This is unbelievably inefficient and completely unusable.

Dennis

  • SlickEdit Team Member
  • Senior Community Member
  • Posts: 2603
  • Hero Points: 396
Re: Tagging not smart enough
« Reply #11 on: January 23, 2012, 09:41:01 pm »
Here are some things you might want to consider to have better luck with References.

1) Do not build a symbol cross-reference for your Java compiler tag file.  Tools > Tag Files...

2) Turn on incremental references.  This will simplify the reference search to do one file at a time.  That way it won't have to spend as much time grinding through every single file in your workspace that uses toString.

3) Look more closely at the results you call "garbage".  They are more than likely cases where tagging cannot evaluate the symbol reference because there are classes used or imported which are not tagged and thus not known to the tagging engine.  Garbage in, garbage out.

I agree that cases such as toString() are annoying.  There is little we can do about it, without having a completely different system for searching symbol cross-reference information, specific to each language (and compiler, for that matter) we support.

As for finding a more accurate, general-purpose solution (since you noted that SlickEdit was searching files that did not have references to the class you searched from): a source file does not have to have a direct reference to a class in order to use a method in that class.  Take this example.  CCC.java has a reference to AAA.myMethod(), but AAA does not appear anywhere in CCC.java.  You have to trace the inheritance chain to determine that relationship.  SlickEdit does that if you give it complete and correct tagging information.  In this example, I get exactly 3 references to AAA.myMethod(), and exactly 2 references to XXX.myMethod().

Code:
// XXX.java
class XXX {
   static int myMethod() { return 0; }
}

// AAA.java
class AAA {
   int myMethod() { return 1; }
}

// BBB.java
class BBB extends AAA {
   int yourMethod() { return XXX.myMethod(); }
}

// CCC.java
class CCC extends BBB {
   int ourMethods() { return myMethod() + yourMethod(); }
}

// DDD.java
class DDD {
   public static void main(String args[]) {
      CCC c = new CCC();
      c.myMethod();
   }
}

That is just one example, and for a very simple language.
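
The inheritance-chain lookup in the example above can be sketched with plain Java reflection. This is only an illustration of the idea, not how SlickEdit resolves symbols: the helper class DeclaringClassSketch and its method declaringClassOf are hypothetical names, the classes are repeated from the example so the block compiles on its own, and the helper only handles methods with no parameters.

```java
import java.lang.reflect.Method;

// Classes repeated from the example above so this block is self-contained.
class XXX {
    static int myMethod() { return 0; }
}

class AAA {
    int myMethod() { return 1; }
}

class BBB extends AAA {
    int yourMethod() { return XXX.myMethod(); }
}

class CCC extends BBB {
    int ourMethods() { return myMethod() + yourMethod(); }
}

// Hypothetical helper, for illustration only.
final class DeclaringClassSketch {

    /**
     * Walks up the superclass chain from start and returns the simple name of
     * the first class that declares a no-argument method called methodName,
     * or null if no class in the chain declares one.
     */
    static String declaringClassOf(Class<?> start, String methodName) {
        for (Class<?> c = start; c != null; c = c.getSuperclass()) {
            for (Method m : c.getDeclaredMethods()) {
                if (m.getName().equals(methodName) && m.getParameterCount() == 0) {
                    return c.getSimpleName();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The unqualified myMethod() call inside CCC resolves through BBB up to AAA.
        System.out.println(declaringClassOf(CCC.class, "myMethod"));
        System.out.println(declaringClassOf(CCC.class, "yourMethod"));
    }
}
```

The point of the sketch is that the file containing the call (CCC.java) never mentions AAA, so a search limited to files that name the class would miss it.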

Tree

  • Community Member
  • Posts: 79
  • Hero Points: 2
Re: Tagging not smart enough
« Reply #12 on: January 23, 2012, 10:16:39 pm »
1) Our project is an order of magnitude larger than the JDK. Tagging or not tagging the JDK is irrelevant.
2) Incremental references makes the normal use case slower. 99% of the time, I want all references.
3) In the toString() example, there are 181 results in my random test. Of these, 33 are actually references to the correct class, and 90 are in third party libraries that *cannot possibly* refer to my random test class. A random sampling of the remaining 60-odd results (all marked with "?") shows explicit casts, generics, and chained calls, mostly. None of the ones I sampled were actually references to my test class. None were references to classes that were not tagged.

Both performance and accuracy are issues. Personally, I'd take the inaccurate results if they could be delivered in seconds instead of minutes. The OP might prefer accuracy, but I'm sure not at the cost of minutes-long searches. Both issues point at the tagging implementation as being "not smart enough" about the relationships between classes.

Dennis

  • SlickEdit Team Member
  • Senior Community Member
  • Posts: 2603
  • Hero Points: 396
Re: Tagging not smart enough
« Reply #13 on: January 23, 2012, 10:47:45 pm »
You might want to re-organize how you are tagging things to improve the results you are getting.

1) It isn't an issue of tagging vs. not tagging the JDK.  It's an issue of tagging it with cross-referencing information or tagging it without cross-referencing information.  You are sure to get nowhere fast in Java if you don't tag the JDK at all.

2) Move the 3rd party source to a separate workspace, tagged without references.  Associate this workspace tag file with your primary workspace as an auto-updated tag file.  This way those third-party libraries will not participate in tag references searches.

3) Give me some real examples of those references that fail.  We do have a shortcoming handling chained calls because, well, it's just plain hard to determine the return type of an overloaded function.  We have an option for trying to handle these cases, but it does not perform up to our standards so we have it off by default.  Also, sometimes it's not a matter of just the class being referenced not being tagged, but also all the classes in the derivation chain need to be tagged in some cases, otherwise SlickEdit is left hanging without enough information to prove what is or is not a reference (so it assumes that it "might be" a reference).
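
To illustrate why chained calls on overloaded functions are hard, here is a small Java sketch. All names in it (ChainedCallSketch, Report, Invoice, load, describe) are hypothetical and chosen only for this example. The two calls look almost identical as text, but which describe() is referenced depends on overload resolution of load(), which in turn depends on the type of the argument expression.

```java
// Hypothetical example: the target of describe() depends on which load() overload is chosen.
final class ChainedCallSketch {

    static final class Report {
        String describe() { return "report"; }
    }

    static final class Invoice {
        String describe() { return "invoice"; }
    }

    // Two overloads with the same name but different return types.
    static Report load(int id) { return new Report(); }

    static Invoice load(String key) { return new Invoice(); }

    // The int argument selects load(int), so this is a reference to Report.describe().
    static String viaInt() {
        return load(42).describe();
    }

    // The String argument selects load(String), so this is a reference to Invoice.describe().
    static String viaString() {
        return load("42").describe();
    }

    public static void main(String[] args) {
        System.out.println(viaInt());
        System.out.println(viaString());
    }
}
```

A tool that cannot evaluate the argument types has no way to choose between Report.describe() and Invoice.describe(), so it is left with the name-only "might be" result for both calls.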

Tree

  • Community Member
  • Posts: 79
  • Hero Points: 2
Re: Tagging not smart enough
« Reply #14 on: January 24, 2012, 01:30:07 am »
1) I have yet to hear a specific definition of what "with references" changes in terms of tagging. Can you provide one?

If it prevents searching for uses *of* code, that's obviously useless. If it prevents searching for references *within* code, that's also useless. I frequently am interested in uses of code within third party code bases. Also, even if I eliminated all the "with references" for third party code, it would not even cut my search space in half. "minutes" times some large fraction is still "minutes."

2) I (and I presume this is standard practice) have separate tag files for each third party package. They are all tagged with references (see 1). Is there some advantage to creating a whole workspace for third party code beyond keeping it in a tag file? I don't think I've ever seen a recommendation to put third party code in a workspace, before...

3) Using toString and the JDK as a readily available example:
- Several of the Collection classes, such as AbstractCollection, show up with references to StringBuffer and/or StringBuilder. This is a common failure pattern.
- java.util.Attributes.writeMain() is an example of a cast.
- java.lang.reflect.Method.toGenericString() hits something new... typeparm is discovered, of type Type, which is well-tagged and has no relation to the object I searched on.
- I can't find the Generics case I was looking at before. java.util.Hashtable is a debatable example. I feel we should have the option to not waste time looking for these references, but they are arguably possibly relevant. ALL the other cases listed above are explicit references to unrelated classes.
- Believe it or not, while looking for these, I found a reference in an *XML file*! Can you believe it? And it wasn't a tag, it was the middle of a block of text! Wow!

If the option for handling chained calls doesn't "perform" in the sense that it's slow, we may be *very* willing to pay that cost at tagging time, but obviously not at reference search time. The general theme in this thread is that we want more "smarts" in the tagging up front to make the reference searching later faster and more accurate.