lor_louis|9 months ago
The core data structure (array of lines) just isn't that well suited to more complex operations.
Anyway here's what I built: https://github.com/lorlouis/cedit
If I were to do it again I'd use a piece table[1]. The VS Code folks wrote a fantastic blog post about it some time ago[2].
[1] https://en.m.wikipedia.org/wiki/Piece_table
[2] https://code.visualstudio.com/blogs/2018/03/23/text-buffer-r...
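For readers who haven't met the piece table mentioned above, here is a minimal sketch of the general idea in Python. It is not taken from cedit or from any particular editor; the names `Piece`, `PieceTable`, `insert` and `text` are made up for illustration. The original text is never modified, inserted text is appended to a second buffer, and the document is described by a list of (buffer, start, length) pieces.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    source: str  # "orig" or "add": which buffer this piece points into
    start: int   # offset into that buffer
    length: int  # number of characters covered by this piece


class PieceTable:
    """Illustrative piece table: a flat list of pieces, no balancing tree."""

    def __init__(self, text: str) -> None:
        self.orig = text  # read-only original contents
        self.add = ""     # append-only buffer holding every insertion
        self.pieces = [Piece("orig", 0, len(text))] if text else []

    def insert(self, pos: int, text: str) -> None:
        if pos < 0:
            raise IndexError("negative position")
        if not text:
            return
        new = Piece("add", len(self.add), len(text))
        self.add += text
        offset = 0
        for i, p in enumerate(self.pieces):
            if pos <= offset + p.length:
                cut = pos - offset
                left = Piece(p.source, p.start, cut)
                right = Piece(p.source, p.start + cut, p.length - cut)
                # Split the piece around the insertion point, dropping empty halves.
                self.pieces[i:i + 1] = [q for q in (left, new, right) if q.length]
                return
            offset += p.length
        if pos != offset:
            raise IndexError("position past end of text")
        self.pieces.append(new)  # only reached when the table is empty

    def text(self) -> str:
        bufs = {"orig": self.orig, "add": self.add}
        return "".join(bufs[p.source][p.start:p.start + p.length] for p in self.pieces)


pt = PieceTable("hello world")
pt.insert(5, ",")
pt.insert(0, ">> ")
print(pt.text())
```

Inserting never copies the existing text, only splits one piece. The trade-off is that reading a range means walking the piece list, which is why real implementations tend to keep the pieces in a tree rather than a flat list.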
vidarh|9 months ago
It does become a problem if you insist on trying to open files of hundreds of MB of text, but my thinking is that I simply don't care to treat that as a text editing problem for my main editor, because files that size are usually something I only ever care to view or am better off manipulating with code.
If you want to be able to open and manipulate huge files, you're right, and then an editor using these kinds of simple methods isn't for you. That's fine.
As it stands now, my editor constantly holds in memory every file I've opened and not explicitly closed in the last 8 years (currently 5420 buffers). The buffer storage is persisted to disk every minute or so, so if I reboot and open the same file, any unsaved changes are still there unless I explicitly reload. Even so, it usually doesn't make the top 50 or so memory users on my machine (those are all browser tabs...)
I'm not suggesting people shouldn't use "fancier" data structures when warranted. It's great some editors can handle huge files. Just that very naive approaches will work fine for a whole lot of use cases.
E.g. the 5420 open buffers in my editor currently are there because even the naive approach of never garbage collecting open buffers just hasn't become an issue yet - my available RAM has increased far faster than the size of the buffer storage so adding a mechanism for culling them just hasn't become a priority.
lor_louis|9 months ago
Regex searches and syntax highlighting might introduce some hitches due to all of the seeking.
pmontra|9 months ago
userbinator|9 months ago
Modern CPUs can read and write memory at dozens of gigabytes per second.
Even when CPUs were 3 orders of magnitude slower, text editors using a single array were widely used. Unless you introduce some accidentally-quadratic or worse algorithm in your operations, I don't think complex data structures are necessary in this application.
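To put rough numbers on that argument, here is a back-of-the-envelope sketch. The 10 GB/s figure is an assumption picked to be on the low side of "dozens of gigabytes per second", and the 20 MB file size and the 10,000-edit count are illustrative, not measurements.

```python
def shift_seconds(file_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Worst-case time to move every byte once, e.g. inserting at offset 0."""
    return file_bytes / bandwidth_bytes_per_s


ASSUMED_BANDWIDTH = 10e9  # assumption: 10 GB/s of effective memmove throughput
FILE_BYTES = 20_000_000   # illustrative 20 MB file

# One keystroke at the very start of the file shifts the whole array once.
one_insert = shift_seconds(FILE_BYTES, ASSUMED_BANDWIDTH)

# A careless "replace all" that does 10,000 separate inserts shifts it 10,000 times.
many_inserts = 10_000 * one_insert

print(f"single insert: {one_insert * 1000:.1f} ms")
print(f"10,000 inserts done one by one: {many_inserts:.1f} s")
```

Under these assumptions a single worst-case insert costs a couple of milliseconds, while the edit-by-edit bulk operation costs tens of seconds. That second case is the accidentally-quadratic pattern to avoid, for example by building the result in one pass instead.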
lifthrasiir|9 months ago
lelanthran|9 months ago
Just how big (and how many lines) does your file have to be before it is a problem? And what are the complex operations that make it a problem?
(Not being argumentative - I'd really like to know!)
On my own text editor (to which I lost the sources way back in 2004) I used an array of bytes, had syntax highlighting (using single-byte start-stop codes) and used a moving "window" into the array for rendering. I never saw a latency problem back then on a Pentium Pro, even with files as large as 20MB.
I am skeptical of the piece table as used in VS Code being that much faster; right now on my 2011 desktop, VS Code with no extra plugins has visible latency when scrolling by holding down the up/down arrow keys with a really high keyboard repeat setting. Same computer, same keyboard repeat and same file using Vim in a standard xterm/uxterm has visibly better scrolling; it takes half as much time to get to the end of the file (about 10k lines).
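The original sources are lost, so the following is only a guess at what a moving "window" over a flat byte array can look like, written in Python for brevity. The function names `visible_lines`, `scroll_down` and `scroll_up` are hypothetical. The point is that rendering only scans as many bytes as fit on screen, regardless of file size.

```python
def visible_lines(buf: bytes, top: int, rows: int) -> list[bytes]:
    """Return up to `rows` lines starting at byte offset `top` (a line start)."""
    lines = []
    pos = top
    while len(lines) < rows and pos < len(buf):
        end = buf.find(b"\n", pos)
        if end == -1:
            end = len(buf)
        lines.append(buf[pos:end])
        pos = end + 1
    return lines


def scroll_down(buf: bytes, top: int) -> int:
    """Offset of the next line start, or `top` if already on the last line."""
    nl = buf.find(b"\n", top)
    return top if nl == -1 or nl + 1 >= len(buf) else nl + 1


def scroll_up(buf: bytes, top: int) -> int:
    """Offset of the previous line start, or 0 at the top of the file."""
    if top == 0:
        return 0
    # buf[top - 1] is the newline ending the previous line; look for the one before it.
    return buf.rfind(b"\n", 0, top - 1) + 1


buf = b"one\ntwo\nthree\nfour"
top = scroll_down(buf, 0)
print(visible_lines(buf, top, 2))
```

Scrolling by one line only moves `top` to an adjacent newline, so holding down an arrow key does a small, bounded amount of work per repeat.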
ofalkaed|9 months ago
I think Vim uses a gap structure, not a single array, but I don't remember.
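For context, here is a minimal sketch of a gap buffer in general, written in Python. It makes no claim about what Vim actually does internally, and the class and method names are invented for illustration. The text lives in one array with an unused "gap" at the cursor, so typing at the cursor fills the gap without shifting the rest of the text.

```python
class GapBuffer:
    """Illustrative gap buffer: one array with a movable hole at the edit point."""

    def __init__(self, text: str = "", gap: int = 16) -> None:
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)     # first unused slot
        self.gap_end = len(self.buf)   # first used slot after the gap

    def __len__(self) -> int:
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos: int) -> None:
        # Shift characters across the gap until the gap starts at `pos`.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, need: int) -> None:
        # Widen the gap in place; roughly doubling keeps regrowth infrequent.
        extra = max(need, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos: int, text: str) -> None:
        if not 0 <= pos <= len(self):
            raise IndexError("position out of range")
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(len(text))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos: int, count: int) -> None:
        if not 0 <= pos <= len(self):
            raise IndexError("position out of range")
        self._move_gap(pos)
        # Deleting just widens the gap over the following characters.
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def text(self) -> str:
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


g = GapBuffer("hello world", gap=4)
g.insert(5, ",")
print(g.text())
```

Consecutive edits at the same spot are cheap; the cost shows up when the cursor jumps far away, because the gap has to be moved across everything in between.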
I am not a programmer, so my experience could very well be due to failings elsewhere in my code, and my reasoning could be hopelessly flawed; hopefully someone will correct me if I am wrong. It has also been a while since I dug into this. The project that got me to dig into it is one of the things that finally got me to make an account on HN, and one of my first submissions was Data Structures for Text Sequences.
https://www.cs.unm.edu/~crowley/papers/sds.pdf
shpx|9 months ago
https://github.com/antirez/kilo/blob/323d93b29bd89a2cb446de9...