**** BEGIN LOGGING AT Sun Jun 11 09:03:03 2006
Jun 11 09:03:03 * Now talking on #gsv
Jun 11 09:03:07 hi there
Jun 11 09:03:08 hi
Jun 11 09:03:18 I'd like to speak a bit about SoC
Jun 11 09:03:29 that's cool :)
Jun 11 09:03:35 can you update me on the progresses?
Jun 11 09:03:48 so we can take some decision about next steps
Jun 11 09:04:29 hi muntyan
Jun 11 09:04:59 well, no progress really
Jun 11 09:05:40 i didn't have time yet to do what we decided to do
Jun 11 09:05:45 hmm... plans, thoughts, etc.
Jun 11 09:05:56 the sorting + processing changes that fall into the batch
Jun 11 09:06:12 um, did you read our conversation with pbor and barisione?
Jun 11 09:06:19 no
Jun 11 09:06:23 oh, ok
Jun 11 09:06:27 I'm going to read it now
Jun 11 09:06:35 then, the plan and stuff is:
Jun 11 09:06:39 pbor: is it already on l.g.o?
Jun 11 09:07:23 attaching
Jun 11 09:07:28 there is a problem with highlighting which seems (seems, nobody knows really why it happens) to occur because sorting the modifications queue is not enough
Jun 11 09:07:38 so the plan is:
Jun 11 09:07:59 http://live.gnome.org/GtkSourceView/SummerOfCode2006?action=AttachFile&do=view&target=2006-06-07.txt
Jun 11 09:08:00 is it me or l.g.o has broken CSS today?
Jun 11 09:08:24 make the function which processes a batch take all the modifications from queue which intersect with the batch
Jun 11 09:08:37 then see what happens - whether the problem is fixed or not
Jun 11 09:09:03 if the problem is fixed, then it means highlighting basically works (hopefully), and we can think of speed issues and whatnot
Jun 11 09:09:33 i guess it wasn't really clear :)
Jun 11 09:09:43 paolo: how much do you know about how the engine works?
Jun 11 09:10:02 little, so I need details to understand the problems
Jun 11 09:10:25 * paolo is reading the old IRC log
Jun 11 09:10:25 okay, i'll try to explain it
Jun 11 09:11:01 what engine does is: 1) it analyzes the buffer from the beginning to the end; 2) it updates syntax after buffer changes
Jun 11 09:11:17 1) is done by processing batch by batch, in an obvious way
Jun 11 09:11:21 now, about modifications:
Jun 11 09:12:01 when text is inserted or deleted, engine records the modification as offset where it happened, and the length of text inserted/removed
Jun 11 09:12:11 then it puts this modification into the queue
Jun 11 09:12:47 then it queues updating (or processes modification immediately, but it's irrelevant)
Jun 11 09:13:30 idle_worker (for simplicity assume it's always idle_worker who does the analysis and everything) does the following:
Jun 11 09:14:06 it peeks the first modification from the queue, and updates syntax in a batch which includes changed region
Jun 11 09:14:34 if it can't merge old and new syntax regions in this one batch, it analyzes the remainder of buffer as it does it with new buffer
Jun 11 09:15:01 if there are no modifications in the queue, it processes not analyzed remainder of the buffer
Jun 11 09:15:14 so, there are bunch of problems here:
Jun 11 09:16:28 1) it needs to sort modifications and update them. for example, if you delete text at position 100, and then delete text at position 10, then the at-pos-10 modification must be processed first, and the at-pos-100 modification should have offset corrected (it was 100, but after second deletion it should be 100-(how many chars deleted))
Jun 11 09:17:29 2) what seems to be the problem now: when engine processes a modification, it takes a big batch, and potentially it changes syntax in whole batch, so the changed area may include modifications which are still in the queue
Jun 11 09:17:54 the plan for the nearest future is to fix 2) and see what happens
Jun 11 09:17:59 the end.
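Problem 1) above (sorting the queue and correcting offsets) can be illustrated with a small sketch. It is not the engine's code: `Modification`, `ModificationQueue` and `record` are hypothetical names, and the rule for a queued position that falls inside a later deletion (clamp it to the deletion start) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Modification:
    offset: int  # position in the buffer, in current buffer coordinates
    delta: int   # > 0: chars inserted, < 0: chars deleted


class ModificationQueue:
    """Hypothetical queue kept sorted and expressed in current coordinates."""

    def __init__(self):
        self.mods = []

    def record(self, offset, delta):
        # Correct the offsets of queued modifications that lie after the new one.
        for m in self.mods:
            if delta >= 0:
                if m.offset >= offset:
                    m.offset += delta
            else:
                removed = -delta
                if m.offset >= offset + removed:
                    m.offset -= removed
                elif m.offset > offset:
                    # Assumption: a position inside the deleted range is clamped.
                    m.offset = offset
        self.mods.append(Modification(offset, delta))
        # The modification closest to the buffer start is processed first.
        self.mods.sort(key=lambda m: m.offset)


queue = ModificationQueue()
queue.record(100, -5)  # delete 5 chars at position 100
queue.record(10, -3)   # then delete 3 chars at position 10
print([(m.offset, m.delta) for m in queue.mods])
```

With the example from the discussion, the deletion at 10 ends up first and the deletion recorded at 100 is moved to 97, i.e. 100 minus the 3 chars removed before it.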
Jun 11 09:18:00 :)
Jun 11 09:18:37 first question: how are batches computed? What is a batch in this context?
Jun 11 09:18:46 4000 chars, roughly
Jun 11 09:18:56 ok
Jun 11 09:19:05 it calculates the length according to how fast it processed previous batches, but that's details
Jun 11 09:19:25 yep
Jun 11 09:20:33 you said the whole buffer is analyzed, batch by batch? Is this a sync operation?
Jun 11 09:20:51 no, idle_worker processes one batch
Jun 11 09:21:11 what happens if a change is queued before the whole buffer has been analyzed for the first time?
Jun 11 09:21:37 it keeps worker_last_offset - how much of the buffer has been analyzed
Jun 11 09:21:49 if the modification is after last_offset, it's just ignored
Jun 11 09:22:09 what happens if I'm going to "show" a part of the buffer that still needs to be analyzed?
Jun 11 09:22:54 if it's before, then the modification is processed; if it could merge old and new in one batch, it processes the region after last_offset as usual (in next idle run); if not, then it sets last_offset to the end of the batch, and then proceeds as usual
Jun 11 09:23:50 when part of buffer is exposed, then it's asked to apply tags in the exposed area; if it's not analyzed yet, there are no tags
Jun 11 09:24:48 i looked at the sources, and:
Jun 11 09:25:44 it checks last_offset.
if exposed area is before last_offset, it applies tags; if not, it records the region to apply tags later, then it will be analyzed
Jun 11 09:26:25 yes, the code there is pretty much what current sourceview does as far as I can tell
Jun 11 09:26:39 (except for some details I don't like)
Jun 11 09:26:44 also, when it updates syntax somewhere, it emits signal so that view can queue redraw in the updated area; then expose will ask to apply correct tags
Jun 11 09:26:49 seems to be correct
Jun 11 09:26:54 how fast is the first analysis of the whole buffer (for example on a 10000 lines C file)
Jun 11 09:27:44 good question
Jun 11 09:28:08 worker_last_offset is the last analyzed offset, right?
Jun 11 09:28:19 yep
Jun 11 09:29:01 it looks "fast enough", i didn't measure it
Jun 11 09:29:11 the analysis of the whole doc on open seems of acceptable performance to me too
Jun 11 09:30:10 one of the issues with the queue instead is that if you have something like d'n'd, an expose event forces to apply tags between the remove and the add part of the dnd
Jun 11 09:30:21 which leads to flicker
Jun 11 09:30:41 I was thinking about making a very fast first analysis that only splits the document in first level contexts and then re-analyzes on demand the current context before displaying it to apply the correct tags
Jun 11 09:30:44 pbor: no, if you process more than one modification at a time
Jun 11 09:30:57 this is what's needed to be done
Jun 11 09:31:19 yes, I am just exposing one of the issues there are now
Jun 11 09:31:26 didn't say it's unfixable
Jun 11 09:31:27 :)
Jun 11 09:31:40 paolo: that would work only if first-level contexts don't need children to know where they end
Jun 11 09:31:49 this is one of things i wanted to talk about
Jun 11 09:32:23 hmm... what do you mean?
Jun 11 09:33:36 1) child context may decide to terminate its parent
Jun 11 09:34:09 2) it seems to be nicer if many rules may decide when to terminate context
Jun 11 09:34:28 at the moment contexts have start and end conditions (regexes or chars, or something)
Jun 11 09:34:39 i think they should have more than that
Jun 11 09:34:54 e.g. here i have a lang file which has #pop#pop
Jun 11 09:35:12 it means it terminates itself and parent
Jun 11 09:35:19 did you understand something?
Jun 11 09:36:04 yep, but can you report a specific use case for this?
Jun 11 09:36:29
Jun 11 09:36:29
Jun 11 09:36:29
Jun 11 09:36:47 some crazy thing from kate
Jun 11 09:36:51 's bash lang file
Jun 11 09:38:03 Is it so important to support this feature? Are there other ways to support the same feature?
Jun 11 09:38:25 i have no idea
Jun 11 09:38:37 the .lang files by barisione support this feature?
Jun 11 09:38:44 but i don't like that contexts have end attribute
Jun 11 09:38:52 paolo: the engine doesn't know about such a thing
Jun 11 09:39:10 context has end which is defined by regex, and that's it
Jun 11 09:39:46 besides, what happens when context and its child are both terminated by same thing, e.g. end of line?
Jun 11 09:40:16 like
Jun 11 09:40:19 #ewfwe //ewfwef
Jun 11 09:40:22 in a C file
Jun 11 09:40:45 (not with gtksourceview's c.lang, it doesn't know that preprocessor directive is whole line, not only word after #)
Jun 11 09:40:54 it seems the .lang files by barisione do not require an "end" attribute for contexts
Jun 11 09:41:16 they don't require it for "simple" contexts
Jun 11 09:41:30 those simple contexts are just regions matched by certain regex
Jun 11 09:41:49 but container contexts do require end attribute
Jun 11 09:42:00 (at least in my reading code and lang files :)
Jun 11 09:42:24 ok
Jun 11 09:43:00 (this is a question to ask to barisione, please collect all the open questions in a mail to send later)
Jun 11 09:43:18 ok
Jun 11 09:43:21 mmm...
isn't the end regex of a container context done by ORing all the possible ends of the child context which terminate the parent?
Jun 11 09:44:09 no
Jun 11 09:44:15 it can't work, can it?
Jun 11 09:44:33 you can't just match anything, contexts may jump like crazy
Jun 11 09:44:33 well, yes
Jun 11 09:44:58 for this to work you need *everything* be controlled by end regexes
Jun 11 09:45:05 e.g. ends-at-line-end breaks it
Jun 11 09:45:07 i think
Jun 11 09:45:09 but you need to use a stack of context also during the fast analysis
Jun 11 09:45:55 ends-at-line-end it is one of the ORed possible ends
Jun 11 09:46:24 but child may opt not to terminate there
Jun 11 09:46:28 while parent may terminate
Jun 11 09:46:36 it's how trailing \ is done
Jun 11 09:46:39 ok
Jun 11 09:47:12 the parent (preprocessor) ends at line-end, but it has child context that consumes end of line, so the parent won't see it
Jun 11 09:47:14 we can analyse this problem later
Jun 11 09:47:24 let us return to the current problems
Jun 11 09:47:39 it seems to me the mother of all the problems is the queue of changes
Jun 11 09:47:50 (regular expressions can encapsulate arbitrarily complex thing, but no one will understand it)
Jun 11 09:47:54 I have a question... do we really need it?
Jun 11 09:48:06 well, yes if we don't want to touch contexts tree
Jun 11 09:48:17 context tree is the mother of the problems
Jun 11 09:48:29 it keeps offsets in the nodes
Jun 11 09:48:31 what is the "context tree"?
Jun 11 09:48:39 ok
Jun 11 09:48:46 stupid question
Jun 11 09:48:56 not really stupid
Jun 11 09:48:59 so the problem is not the context tree per se
Jun 11 09:49:25 well, it depends on how you look at it
Jun 11 09:49:29 let me explain
Jun 11 09:49:40 there are two things in here:
Jun 11 09:50:08 1) the 'state machine', the tree of contexts which represents how contexts may be there, how parents include children and such
Jun 11 09:50:42 2) the sequence of segments of text, where each segment corresponds to some state of that state machine, or to some node of the contexts tree
Jun 11 09:50:58 in the engine these two are one thing, the "contexts tree"
Jun 11 09:51:25 if it's right, then the queue is the problem, but it can't be removed
Jun 11 09:51:36 if it's not right, then the contexts tree is the problem
Jun 11 09:51:37 :)
Jun 11 09:52:54 the *real* problem is that the engine has no idea about the fact that buffer can be modified many times and in any order; the rest is implementation details ;)
Jun 11 09:53:11 and we are free to solve it in any way we like
Jun 11 09:53:31 (saying this just to make sure no one is afraid of removing queue or something)
Jun 11 09:54:09 well, it seems to me we need a contexts tree (or syntax tree)
Jun 11 09:54:17 do all agree on this?
Jun 11 09:55:11 and if I have understood the problem, the current algorithms do not support multiple changes
Jun 11 09:55:25 but only single changes, right?
Jun 11 09:55:46 well, I guess there are other data structures which could hold the syntax information
Jun 11 09:56:10 we do need one tree, for sure
Jun 11 09:56:16 in particular all the regex work is based on analyzing lines
Jun 11 09:56:28 while the context tree has no idea about lines
Jun 11 09:56:35 but we don't necessarily need to have syntax segments in the contexts tree
Jun 11 09:57:13 and yes, current algorithms do not really support multiple changes
Jun 11 09:57:39 ok, explain me how do you see the perfect context tree
Jun 11 09:58:01 I mean we have some code, it is not working for several reasons
Jun 11 09:58:07 i wish i knew what perfect context tree is :)
Jun 11 09:58:14 i guess i'd start implementing it
Jun 11 09:58:19 there are not millions of lines
Jun 11 09:58:34 so we can rewrite part of it without too much worrying
Jun 11 09:58:47 i can describe what i have in my engine:
Jun 11 09:58:52 ok
Jun 11 09:59:21 1) the contexts tree, which is just a tree of contexts and their children, nodes are added when corresponding context is met in the buffer
Jun 11 09:59:22 may be explain also how it is different from the gsv one
Jun 11 10:00:06 2) the text part is kept in a separate structure: each line has list of segments, each segment keeps pointer to a node of contexts tree
Jun 11 10:00:45 this makes it easy to invalidate parts of buffer - just erase segments on the lines
Jun 11 10:01:06 it's still not convenient for modifications inside of the line though
Jun 11 10:01:20 err, this is a bit not what i am describing :)
Jun 11 10:01:52 well, if you invalidate part of the buffer you also need to remove nodes from the tree, right?
Jun 11 10:01:54 so, the main thing is: contexts are the states in the state machine; segments in texts are segments which "are in this state"
Jun 11 10:02:01 paolo: no
Jun 11 10:02:08 why not?
Jun 11 10:02:18 i have one node for top-level comments, for example
Jun 11 10:02:29 regardless of how many comments are in the buffer
Jun 11 10:02:46 i have many text segments that point to "top-level C comment" node
Jun 11 10:03:17 basically your tree describes the possible state transitions for a language
Jun 11 10:04:01 right
Jun 11 10:04:17 (not all possible, only those that actually occur in the text)
Jun 11 10:04:25 it contains a subset of the state machine for the specific language
Jun 11 10:04:31 right
Jun 11 10:05:01 what's good in this (or bad in gtksourceview's) is the following:
Jun 11 10:05:17 in mine, i can merge two segments simply by comparing the nodes
Jun 11 10:05:32 in gtksourceview's, it needs nodes to be the same with all parents
Jun 11 10:06:17 not sure it's really very good, but it is simple
Jun 11 10:06:54 there is a potential problem in gtksourceview, related to this:
Jun 11 10:07:24 when text is modified, the engine takes a batch of text, and reanalyzes it, trying to merge old and new nodes
Jun 11 10:07:44 if it can't do it in this batch, it simply reanalyzes whole remainder of the buffer
Jun 11 10:08:01 even if contexts can be merged in the next batch
Jun 11 10:08:23 hmmm... actually I think we need what is normally called a "concrete syntax tree"
Jun 11 10:08:32 what's that?
Jun 11 10:08:36 see http://en.wikipedia.org/wiki/Concrete_syntax_tree
Jun 11 10:09:24 it looks like the tree in gtksourcecontextengine
Jun 11 10:09:39 yep, that's what the context tree is right now
Jun 11 10:09:50 the problem is that it's not modification-friendly
Jun 11 10:10:11 that is a tree that represents the current state of the whole buffer
Jun 11 10:10:12 just coloring is only half of work; we need to recolor as well :)
Jun 11 10:11:45 building a tree like muntyan's one would also give us the chance to share it across files
Jun 11 10:11:48 hm, does colorer support modifications?
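The split muntyan describes (one tree node per context that occurs, many text segments pointing at it) could be sketched roughly as follows. This is an illustration only, not code from either engine: `ContextNode`, `child` and `merge_segments` are hypothetical names, and segments are modelled as plain `(start, end, node)` tuples.

```python
class ContextNode:
    """One state of the language's state machine, created when first met."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}

    def child(self, name):
        # Reuse the node if this context was already met under this parent.
        if name not in self.children:
            self.children[name] = ContextNode(name, self)
        return self.children[name]


def merge_segments(segments):
    """Merge adjacent (start, end, node) segments that point to the same node."""
    merged = []
    for start, end, node in segments:
        if merged and merged[-1][2] is node and merged[-1][1] == start:
            # Comparing the node is enough; parents are implied by the node.
            merged[-1] = (merged[-1][0], end, node)
        else:
            merged.append((start, end, node))
    return merged


root = ContextNode("c")
comment = root.child("comment")
same_comment = root.child("comment")  # a second comment reuses the node

line = [(0, 4, comment), (4, 9, same_comment), (9, 12, root)]
print([(s, e, n.name) for s, e, n in merge_segments(line)])
```

Because every top-level comment maps to the same node, the tree stays small no matter how many comments the buffer holds, and invalidating text only means dropping segments, not tree nodes.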
Jun 11 10:11:56 that is one tree for all C files open
Jun 11 10:12:12 muntyan: yes, it's the engine used in eclipse I think
Jun 11 10:13:10 colorer should support modifications
Jun 11 10:13:14 it is not used in eclipse
Jun 11 10:13:44 when barisione started his work we planned to rewrite colorer in a glib friendly way
Jun 11 10:14:01 but it seems it was not a good solution
Jun 11 10:14:08 I don't remember the details
Jun 11 10:15:19 paolo: http://colorer.sourceforge.net/sshots/e2.png
Jun 11 10:15:30 paolo: though maybe it's just an additional plugin
Jun 11 10:15:39 there is a plugin for eclipse based on colorer
Jun 11 10:17:03 muntyan: can you tell us more about the other data structure you use? that is the one that holds the segments...
Jun 11 10:17:22 it is not clear to me why a tree like the one I said should not work
Jun 11 10:17:28 muntyan: I know it's a btree, but I don't totally grok why
Jun 11 10:17:30 ?
Jun 11 10:17:42 paolo: it's already there, and it doesn't work :)
Jun 11 10:17:55 paolo: the tree itself will work, why not? but it needs some stuff to support it
Jun 11 10:18:10 yep, I agree on this
Jun 11 10:18:36 well, it could also work without additional data structures
Jun 11 10:18:40 pbor: btree because it has easy insertion and deletion
Jun 11 10:18:50 paolo: how to deal woth modifications?
Jun 11 10:18:54 wih
Jun 11 10:18:56 with
Jun 11 10:19:22 invalidating part of the tree when a modification happens
Jun 11 10:19:37 don't forget about tags
Jun 11 10:19:54 tags are applied in the buffer, and they need to be removed when new highlighting is ready
Jun 11 10:20:00 muntyan: I mean, what do you cache in the nodes?
for instance GtkTextBTree allows to know if there is a tag without checking all the leafs/segments
Jun 11 10:20:18 at the moment tree keeps offsets, and with modifications queue you always can tell where tags are applied
Jun 11 10:20:41 pbor: let me look at it :)
Jun 11 10:21:30 I don't think tags are a problem here
Jun 11 10:21:52 suppose you invalidate a part of the tree (a subtree)
Jun 11 10:21:59 pbor: i keep number of line marks in nodes; and leaf nodes contain highlighting stuff and the marks
Jun 11 10:22:06 a subtree represents a part of the buffer
Jun 11 10:22:13 paolo: tags are not a problem, the problem is having offsets in the tree nodes
Jun 11 10:22:21 you have to remove the tags from that part of the buffer
Jun 11 10:22:34 paolo: but you can't do it until you have new tags
Jun 11 10:22:48 muntyan: why not?
Jun 11 10:23:02 because there is nothing to delete until then
Jun 11 10:23:07 pbor: since we need to use relative offset
Jun 11 10:23:12 paolo: yes, but how do you map a subtree to an area of the buffer?
Jun 11 10:23:13 if you delete text, there is no region to delete tags from
Jun 11 10:23:13 pbor: not absolute ones
Jun 11 10:23:22 you need to analyze it first
Jun 11 10:23:34 or if you insert text, same thing
Jun 11 10:24:02 you can mark position as "invalid", but you don't know in advance what *area* really changed
Jun 11 10:24:42 hmm...
it is difficult to explain without figures
Jun 11 10:25:09 let me try step by step
Jun 11 10:25:16 don't think to the current implementation
Jun 11 10:25:23 hm, actually i don't see why what i'm saying makes sense
Jun 11 10:25:38 suppose you have a syntax tree
Jun 11 10:25:57 like the one in the wikipedia figure
Jun 11 10:27:41 suppose you add a piece of text between "the" and "ball", then you invalidate a part of the tree
Jun 11 10:28:12 if the text does not create a new context, then you only need to add a new node (without invalidating other nodes)
Jun 11 10:28:31 otherwise you will have to invalidate part (or all) the tree
Jun 11 10:29:34 let us speak about text offset now
Jun 11 10:29:57 ATM we use absolute offset
Jun 11 10:30:04 i'll be back in ten minutes, son is hungry
Jun 11 10:30:36 so "John" is [0-4], "hit" is [5-8] and so on
Jun 11 10:30:55 I think we should use relative offset
Jun 11 10:31:08 where relative is according to the "father" node
Jun 11 10:31:46 in this way it will be easier to make changes without visiting all the tree
Jun 11 10:32:10 and the queue will contain "nodes" and not text segments
Jun 11 10:32:16 what do you think?
Jun 11 10:32:34 using relative offsets may be better for the tree merging, but I don't see how it helps us:
Jun 11 10:32:57 the fact is that the real analysis happens async, in the idle
Jun 11 10:33:25 so a second modification can happen before you analyzed the first one
Jun 11 10:33:47 wait
Jun 11 10:33:51 and you can't put the nodes in queue, since it means analyzing right away
Jun 11 10:34:19 we can split analysis in a sync part (i.e. updating or invalidating the tree)
Jun 11 10:34:45 and a second async part "rebuilding" the invalidated part of the tree
Jun 11 10:35:48 i.e. always having a fast analysis computing only the "first" level of the subtree
Jun 11 10:36:04 well, that depends on how fast the tree invalidation is...
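The relative-offset idea proposed above can be made concrete with a small sketch built on the "John hit the ball" example. Everything here is an assumption made for illustration (the `Node` class, `insert_text`, the exact tree shape); it is not how gtksourceview stores its contexts.

```python
class Node:
    def __init__(self, name, rel_start, length, parent=None):
        self.name = name
        self.rel_start = rel_start  # offset from the start of the parent node
        self.length = length
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def abs_start(self):
        # The absolute position is the sum of relative offsets up to the root.
        node, total = self, 0
        while node is not None:
            total += node.rel_start
            node = node.parent
        return total


def insert_text(node, rel_pos, n_chars):
    """Record n_chars inserted at rel_pos inside node."""
    # Children located at or after the insertion point move right.
    for child in node.children:
        if child.rel_start >= rel_pos:
            child.rel_start += n_chars
    # The node and its ancestors grow; only later siblings are shifted,
    # and their subtrees are never visited.
    while node is not None:
        node.length += n_chars
        if node.parent is not None:
            for sibling in node.parent.children:
                if sibling.rel_start > node.rel_start:
                    sibling.rel_start += n_chars
        node = node.parent


# "John hit the ball": John is [0-4], hit is [5-8], and so on.
sentence = Node("S", 0, 17)
john = Node("John", 0, 4, sentence)
vp = Node("VP", 5, 12, sentence)
hit = Node("hit", 0, 3, vp)
np = Node("NP", 4, 8, vp)
the = Node("the", 0, 3, np)
ball = Node("ball", 4, 4, np)

insert_text(np, 4, 4)    # add "red " between "the" and "ball"
insert_text(john, 4, 2)  # turn "John" into "Johnny"
print(ball.rel_start, ball.abs_start(), sentence.length)
```

The second insertion shifts only the `VP` node, yet the absolute position of "ball" still comes out right, which is the "changes without visiting all the tree" property mentioned in the discussion.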
Jun 11 10:36:37 it only works on the tree, so I think it should be fast enough
Jun 11 10:37:26 it may also have other problems for performance (though here is all moot without numbers)
Jun 11 10:38:05 for instance a 'paste' inserts many chunks of text (one for each tag toggle)
Jun 11 10:38:26 so you would do many invalidations before reanalyzing
Jun 11 10:38:45 how do you invalidate part between contexts?
Jun 11 10:38:59 e.g. top-level no-contexts text
Jun 11 10:39:14 2) merging is still hard
Jun 11 10:39:29 merging will be always hard :)
Jun 11 10:39:43 not if you don't keep segments in the tree
Jun 11 10:39:49 invalidating part between contexts means adding a new context
Jun 11 10:40:03 and updating the offset of the following siblings
Jun 11 10:40:30 it is the easiest update
Jun 11 10:41:41 what I mean is that syntax trees are historically the most important data structure in language analysis
Jun 11 10:41:57 and I don't think we have to invent new structures
Jun 11 10:42:19 well, let me play angry:
Jun 11 10:42:57 i have working highlighting without any knowledge of history; and there is broken engine which didn't invent new data structures
Jun 11 10:42:57 What about trying to look at eclipse code and at colorer to see what they are using?
Jun 11 10:43:03 * muntyan stops playing angry
Jun 11 10:43:22 hehe :)
Jun 11 10:43:30 what i want to say is that we need something that will work and is good. "history" is zero weight argument here, imho
Jun 11 10:44:37 sheesh, color tarball doesn't have single toplevel dir!
Jun 11 10:44:41 colorer
Jun 11 10:44:45 good it's in /tmp
Jun 11 10:44:45 well, history is always important... if you want call it a "design pattern" :)
Jun 11 10:45:03 i don't give a shit to that, it means too little these days
Jun 11 10:45:11 everybody uses "design patterns" ;)
Jun 11 10:45:35 paolo: well, the fact that a tree can be used to represent a syntax is no news, but the real question is what do we need the tree for?
Jun 11 10:45:40 anyway, if a tree is good, it's good. but we need to know for sure what we choose, and why
Jun 11 10:46:43 I mean, the engine is keeping a tree of the whole buffer syntax up to date, but as far as I can see it's not really using it
Jun 11 10:47:22 for highlighting it just wants to know which tag applies to each segment
Jun 11 10:47:38 and to do this you have to walk the whole segments
Jun 11 10:47:57 be it going through a list, walking a tree or whatever
Jun 11 10:49:00 i personally worry about modifications, they have proven to be the hard part ;)
Jun 11 10:49:27 maybe the whole syntax tree can be very useful for stuff like folding, or for indenters... but I am missing the compelling reason to have a full tree representation of the syntax of the text in the buffer
Jun 11 10:50:04 to avoid to re-analyze part of the buffer during modifications
Jun 11 10:50:18 and put the basis for further analysis
Jun 11 10:51:22 but for hl I just want to iter through the text and know at which point of the syntax 'stack' I am in
Jun 11 10:52:08 that is, I just need to know the parent context, not all the ancestors
Jun 11 10:52:31 (except for the #pop#pop case muntyan said before)
Jun 11 10:52:32 in either case we have a tree, just maybe implicitly; question is whether we really need a big tree with number of nodes equal number of syntax regions in the buffer
Jun 11 10:53:00 pbor: we do need all the ancestors; but once we have parent, we have its parent, and so on
Jun 11 10:53:29 muntyan: yes, sure
Jun 11 10:54:21 anyway, I think that invalidating the tree right away could solve the issues with the queue
Jun 11 10:54:38 what has to be seen is if we can do that fast enough
Jun 11 10:55:28 should be fast
Jun 11 10:58:08 as far as I can see that is also orthogonal to the relative offset issue
Jun 11 10:59:10 what do you mean?
Jun 11 10:59:45 that you can invalidate a part of a tree even with absolute offsets
Jun 11 10:59:49 can't you?
Jun 11 11:00:36 yep, they are relative problems
Jun 11 11:00:44 I mean orthogonal
Jun 11 11:00:54 but if offsets are absolute, you need to walk all the nodes?
Jun 11 11:01:17 using relative offset enables in some cases very fast changes without the need to walk all the subtree
Jun 11 11:01:35 relative offsets rule
Jun 11 11:01:41 absolute offsets suck
Jun 11 11:01:43 :)
Jun 11 11:02:47 yes, we all agree that relative offsets are better
Jun 11 11:03:01 I'd say: can we try to write down the "modification" algorithm in pseudo code (so we can also use it for documentation)?
Jun 11 11:04:25 I think a modification can only have 3 effects:
Jun 11 11:04:31 - adding a context
Jun 11 11:05:15 - invalidating a subtree of contexts
Jun 11 11:05:50 well they are only 2
Jun 11 11:06:33 actually we could reduce "adding a context" as a sub-case of "invalidating a subtree"
Jun 11 11:06:34 comments in colorer look like it reparses everything after modification
Jun 11 11:07:09 pbor: do you still have the thesis of barisione?
Jun 11 11:07:15 paolo: yes
Jun 11 11:07:21 what does "invalidating subtree" mean? invalidating whole subtree and reanalyzing it?
Jun 11 11:07:28 are there comments on colorer?
Jun 11 11:08:28 it means removing it from the context tree and mark the linked buffer segment as "to-be-analyzed"
Jun 11 11:09:13 paolo: not really... just a small paragraph where it says why implementing sourceview instead of using colorer
Jun 11 11:09:29 well, no, it means:
Jun 11 11:09:37 - removing the subtree
Jun 11 11:09:41 paolo: and the arguments are: overcomplex, C++, ugly lang files
Jun 11 11:09:59 i am thinking about inserting text at the top level
Jun 11 11:10:01 - leaving the root (marked as "to-be-analyzed")
Jun 11 11:10:18 you don't want to remove anything, i.e.
adding a node is not invalidating a subtree
Jun 11 11:10:50 http://rafb.net/paste/results/bXZKT725.html
Jun 11 11:10:55 ok, I agree
Jun 11 11:11:12 that's from colorer header, looks rather nasty
Jun 11 11:11:36 ok
Jun 11 11:11:53 anyway, who is colorer?
Jun 11 11:11:59 does it work in any decent editor?
Jun 11 11:12:12 see the homepage :)
Jun 11 11:12:24 I have never tried the cited editors
Jun 11 11:12:31 eclipse is shit, mc doesn't have live updating on modifications
Jun 11 11:12:38 dunno about far
Jun 11 11:13:44 eclipse is great :)
Jun 11 11:13:51 but it is not using colorer
Jun 11 11:15:00 eclipse is great? you have strong nerves then :)
Jun 11 11:15:08 or using java
Jun 11 11:15:52 using java
Jun 11 11:16:04 it must be great then :)
Jun 11 11:18:14 eclipse is a great IDE for Java... but it sucks for C/C++
Jun 11 11:21:50 so, to sum things up, do we have a plan?
Jun 11 11:24:04 muntyan: how much effort is to 'fix' the queue for the 'debug thing'? if it's not much maybe it would be nice to fix it just to see if something else bites us before touching the context tree...
Jun 11 11:25:25 I think the plan is:
Jun 11 11:25:31 lot of effort
Jun 11 11:25:59 1. writing the modification algorithm in pseudo-code
Jun 11 11:26:10 2. validating it by hand
Jun 11 11:26:39 3. modifying the current context tree according to 1 (using relative offsets)
Jun 11 11:26:45 do you agree?
Jun 11 11:26:52 3. is really really hard
Jun 11 11:27:06 why?
Jun 11 11:27:07 for me it basically means rewriting everything
Jun 11 11:27:22 it could be
Jun 11 11:27:56 trying to preserve most code is possible
Jun 11 11:28:19 or cutting and pasting in the new code most old-code is possible
Jun 11 11:28:38 I'm thinking to the code performing parsing
Jun 11 11:28:51 one needs to understand the code first ;)
Jun 11 11:28:55 do you guys agree in my action plans?
Jun 11 11:29:07 honestly, i don't like it
Jun 11 11:29:14 why?
Jun 11 11:29:18 we did not discuss all the issues
Jun 11 11:29:31 let us discuss them, then
Jun 11 11:29:42 and this plan is a big plan, it's not something "for now to see what happens"
Jun 11 11:30:09 how to merge old and new trees?
Jun 11 11:30:24 do we want to give up after N characters and reanalyze remainder of the buffer?
Jun 11 11:31:45 well, that is something that needs measuring
Jun 11 11:32:01 it really depends on how much slower it is to try to merge the old tree
Jun 11 11:32:20 it's important
Jun 11 11:32:25 well, this is what point 1 needs to
Jun 11 11:32:35