SlickEdit Community

SlickEdit Product Discussion => SlickEdit® => Topic started by: rowbearto on December 11, 2019, 08:29:40 pm

Title: core dump 2019 Dec 11
Post by: rowbearto on December 11, 2019, 08:29:40 pm
Got a core dump today. I was editing a Python file with some very long lines and had soft wrap turned on - I suspect it is related to the long lines and soft wrap. Not like my previous crashes with "Send compile output to window". I included a log of the stdout as I have the special debug build, but there is nothing there.

Look for crash_2019_12_11.tar.xz on support.
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 11, 2019, 08:48:07 pm
Got it to crash a 2nd time. With soft wrap on, go to the end of one of the very long lines and press <ENTER>. That is when it crashed for me the 2nd time.
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 11, 2019, 09:29:07 pm
I'll take a look at it.
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 12, 2019, 07:42:05 pm
Can you upload the vs.log from your_config_dir/24.0.0/logs/vs.log? 
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 12, 2019, 07:48:50 pm
See vs_2019_12_11.log

I have a bunch of debug prints in there from when I was debugging Graeme's xretrace macro
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 12, 2019, 08:52:31 pm
Thanks.  One more question, are there any files named vsstack.* in your /tmp directory?  Assuming it hasn't been rebooted since the crash?

Edit: to clarify, no one was really expecting these logs.  It looks like when the crash happened, it got to a section of our code where it knew something was badly wrong, and it took a stack dump. (not added by my debug additions).  But the way the call was made, it might have made it to a /tmp file rather than the standard vs.log.

I didn't reproduce it with the python file with the long lines, and haven't seen anything strange under valgrind with it so far.  Is the xretrace you're running the one from the xretrace post, or a modified one?
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 13, 2019, 02:20:54 pm
Yes there was a vsstack file in my /tmp directory - on support look for vsstack.rbresali.

I've also posted the xretrace.zip file corresponding to the exact macro I am using, see: https://community.slickedit.com/index.php/topic,16598.msg67581.html#msg67581

Based on the vsstack file I don't think the crash is due to xretrace, but maybe I'm wrong.

To install xretrace, you need to:

1) make xretrace a subdirectory of your config directory
2) load "xload-macros.e"
3) run "xload-macros".
4) Answer yes to recompile and answer yes for each question to load each macro.
5) Run "xretrace-show-control-panel"
6) Uncheck "retrace delayed start"
7) Restart slickedit
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 13, 2019, 02:27:14 pm
I doubt it too, but since I'm not seeing anything going on so far with your config and files, I'm going to be paranoid and try with this as well. Thanks for the files.
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 13, 2019, 02:35:04 pm
FYI: A short time after I had these crashes (there were quite a few, as I kept trying to add some code after the long lines and had to get creative to do what I wanted), SlickEdit would crash right away when I tried relaunching it. I deleted the state file and then all was good, I could start. I'm not dealing with the long lines in the python anymore, and I haven't had a crash since. I was very busy at the time and didn't save my crashing state file.
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 17, 2019, 11:16:32 pm
Yes, the crashes have been from corruption of the statefile and slick-c heap, so it's entirely possible for a bad state file to have been written for one of these. 

I've been looking at xretrace, and I'm no longer so certain it's not playing a role in the crashes.  It's using pointers for/around the list functions, and Slick-C's garbage collector is not a tracing garbage collector. So if you have a pointer that points to a value in an array or hashtable, and you modify the array/hashtable so it needs to dynamically resize, that pointer is left pointing into the Slick-C heap at a location that's going to be storage for something else.  And writing to that location can do a lot of damage, due to the way the values are represented in memory.
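
For readers more at home in C++ than Slick-C, std::vector has the same hazard. The sketch below is only an analogy: it is not Slick-C and not xretrace code, and `Node`, `storage_moved_after_growth` and `link_by_index` are made-up names. It shows that growing an array can move its storage, so a pointer taken earlier would be left dangling, while an index into the array stays valid across the growth.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical node type, loosely modelled on a linked list stored in an array.
struct Node {
    int next;
    int prev;
};

// Returns true if growing the vector moved its storage, i.e. any pointer
// taken before the growth would now point at memory the vector no longer owns.
bool storage_moved_after_growth() {
    std::vector<Node> nodes(4, Node{-1, -1});
    // Record the address as an integer so a stale pointer is never used.
    const std::uintptr_t before = reinterpret_cast<std::uintptr_t>(nodes.data());
    const std::size_t old_capacity = nodes.capacity();
    // Growing past the current capacity forces a reallocation.
    while (nodes.size() <= old_capacity) {
        nodes.push_back(Node{-1, -1});
    }
    const std::uintptr_t after = reinterpret_cast<std::uintptr_t>(nodes.data());
    return before != after;
}

// Refers to the existing node by index instead of by pointer, so the
// push_back in the middle cannot invalidate the reference.
int link_by_index() {
    std::vector<Node> nodes(1, Node{-1, -1});
    const std::size_t current = 0;            // index, not pointer
    nodes.push_back(Node{-1, 0});             // may reallocate
    const int new_index = static_cast<int>(nodes.size() - 1);
    nodes[current].next = new_index;          // looked up again after growth
    return nodes[current].next;
}
```

Holding an index and looking the element up again after any growth is the simplest way to sidestep the problem; a pointer is only safe while nothing can resize the container.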

The most worrying is in dlist_insert() in DLinkList.e.  On line 546, it takes a pointer to an array value in the list's `nodes` member.  On the face of it, it looks ok, because there's a previous call to dlist_get_new_node() before the pointer is taken.  But when there are no nodes on the free list, that function just returns the index of the next node to use, so the possible re-allocation of the nodes array doesn't happen till the array assignment later on at 558, and then there's a modification through the pointer on line 559.

There's also a case in xretrace_add_bookmark_for_buffer() where it modifies the buffer_bookmark_list:[] hashtable, while it might be possible there's a global pointer ptr_bookmark_list_for_buffer pointing to one of the values in the hashtable.  I don't consider this one as likely, as I can't see it in practice when that pointer is non-null, and it looks like it can only trigger when the xretrace scrollbar is up.

This isn't an exhaustive list, I haven't looked through all of the code yet. I had looked at dlist_insert() about 3 times before I spotted the problem there.  I need to take a break to get some hotfixes off my backlog, and then I'll go through the rest with fresh eyes, and see if I can find anything else.

Title: Re: core dump 2019 Dec 11
Post by: Graeme on December 18, 2019, 01:39:23 am
I'm not sure I knew that arrays could get re-allocated when they increase in length or that pointers / references could be invalidated.  The statement
np->s_next = nnode;
will presumably write to an invalid location, but it also means that the link to the new node isn't set correctly - who knows what that could do - it might link to itself and loop forever.
I definitely got crashes on Linux related to xretrace but never on Windows.


Code:
boolean dlist_insert(dlist_iterator iter, typeless & val)
{
   dlist_node * np;
   if (iter.hndl < 0 || iter.hndl >= iter.listptr->nodes._length() || iter.listptr->nodes[iter.hndl].s_prev < -1) {
      // if the dereference (*) creates a (large) temporary list object here,
      // the push_front will fail!
      return dlist_push_front(*iter.listptr, val);
   }
   int nnode = dlist_get_new_node(*iter.listptr, true);
   if (nnode < 0) {
      return false;
   }
   np = &iter.listptr->nodes[iter.hndl];
   int nextn = np->s_next;
   iter.listptr->nodes[nnode].s_next = nextn;
   if (nextn < 0 || nextn >= iter.listptr->nodes._length()) {
      // there is no (valid) next node so make the new node the tail
      iter.listptr->s_tail = nnode;
   }
   else {
      // link to the next node
      iter.listptr->nodes[nextn].s_prev = nnode;
   }
   // link to the prev node
   iter.listptr->nodes[nnode].s_prev = iter.hndl;
   np->s_next = nnode;
   // copy the value
   iter.listptr->nodes[nnode].s_data = val;
   return true;
}

Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 18, 2019, 02:54:47 am
How do I uninstall xretrace and the whole suite?

I used the xretrace-show-control-panel and checked the "retrace delayed start".

I loaded all the macros in xload-macros, do I just go to each .e file in the directory and do "unload"?
Title: Re: core dump 2019 Dec 11
Post by: Graeme on December 18, 2019, 03:10:49 am
I would suggest calling xretrace_disable from the command line (or from the settings dialog) before unloading all the files.
Title: Re: core dump 2019 Dec 11
Post by: Graeme on December 18, 2019, 03:37:44 am
FYI - xretrace doesn't actually call the problem function (dlist_insert), directly or indirectly. I will check for other cases when I get time.

I suspect the arrowed line of code below will fix the re-allocation problem


Code:
static int dlist_get_new_node(dlist & dl, boolean front)
{
   int nnode = dl.free_head;
   if (nnode >= 0) {
      dl.free_head = dl.nodes[nnode].s_next;
      --dl.free_count;
      return nnode;
   }
   else
   {
      // nothing in the free list, see if we can allocate a new node
      nnode = dl.nodes._length();
      if (nnode >= dl.max_nodes) {
         // can't allocate a new one, can we overwrite
         if (!dl.overwrite_f) {
            return -1;
         }
         nnode = front ? dl.s_tail : dl.s_head;
         delink_node(dl, nnode);
      }
      dl.nodes[nnode].s_next = -1;       // <<<<<<<<<<<<<<<<<<<<<<<< bug fix
      return nnode;
   }
   return 0;
}
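
The ordering this fix relies on (grow the array first, take the pointer afterwards) can be sketched in C++ as an analogy. This is not Slick-C and not the DLinkList.e code; `ListNode` and `insert_after` are hypothetical names, and the sketch assumes a plain std::vector with no free list and no node limit.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical node type for a doubly linked list stored in a vector.
struct ListNode {
    int next = -1;
    int prev = -1;
    int data = 0;
};

// Inserts a new node after index `at` and returns the new node's index.
// Assumes 0 <= at < nodes.size().
int insert_after(std::vector<ListNode>& nodes, int at, int value) {
    const int new_index = static_cast<int>(nodes.size());
    // Grow the array FIRST. Any reallocation happens here, before a
    // pointer into the array exists.
    nodes.push_back(ListNode{});
    // Only now take the pointer; nothing below resizes the vector.
    ListNode* np = &nodes[static_cast<std::size_t>(at)];
    const int old_next = np->next;
    ListNode& fresh = nodes[static_cast<std::size_t>(new_index)];
    fresh.next = old_next;
    fresh.prev = at;
    fresh.data = value;
    if (old_next >= 0) {
        // Link the following node back to the new node.
        nodes[static_cast<std::size_t>(old_next)].prev = new_index;
    }
    np->next = new_index;
    return new_index;
}
```

If the push_back were moved below the line that takes `np`, the final write through `np` could land in freed storage, which is the pattern described for dlist_insert().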

Title: Re: core dump 2019 Dec 11
Post by: Graeme on December 18, 2019, 04:20:47 am
In case anyone's interested the case number I had for the Linux crash I got was CAS-71751-8841  - but it was too confusing to investigate.
Title: Re: core dump 2019 Dec 11
Post by: patrick on December 18, 2019, 12:14:06 pm
I'll look back at that case and see if that makes more sense given what I've seen with Rowbearto's core dumps.

It's a tough one to track down, I haven't been able to reproduce anything with or without xretrace.  I started looking at xretrace because I realized (a little late) the only reported coredumps on linux for v24 post beta were with it. It may end up being just a coincidence. 
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 18, 2019, 04:26:16 pm
I'm able to reproduce it immediately without xretrace!

I unloaded all xretrace modules and deleted my state file.

Then I load the python file that I sent you, turn on soft wrap (Patrick: Did you enable soft wrap?), go to the very end of line 189 (closing quote), press <ENTER> and crash right away.

Patrick: Make sure you:

1) Use my user.cfg.xml
2) Turn on soft wrap
3) Go to the end of line 189, just after the closing quote, and press <ENTER>

If there are any more files I can provide you let me know.
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 18, 2019, 04:28:13 pm
Patrick:

All my core dumps post beta for v24 were not with xretrace. Only this one was. The first time that I loaded xretrace was on December 8. So all the core dumps I gave you before Dec 8 did not have xretrace.
Title: Re: core dump 2019 Dec 11
Post by: rowbearto on December 18, 2019, 04:32:11 pm
There are 2 other macros that I have had loaded the whole time. I unloaded both of them, deleted my state file and I was able to repro it again. So it is not my macros either.

After deleting the state file, I relaunched SE, went to line 189, turned on soft wrap, went to the end of line 189 after the quote, pressed <ENTER>, and crash right away.
Title: Re: core dump 2019 Dec 11
Post by: Graeme on December 18, 2019, 09:31:53 pm
I was very pleased and impressed that you looked at xretrace and managed to notice an obscure problem.  Any more comments you have are very welcome.

