This entry is just me wondering about the paths one takes to really understand a concept, using String interning as the example.
A few days ago I asked one guy, fresh out of college but with 2.5 years of Java experience, two Java puzzles that I occasionally ask people. Only a few people have answered even one of them well; he nearly got both. The puzzles are about Strings and comparisons in Java, a subject with quite a few quirks.
And this got me thinking: where did I learn about that?
Turns out, thanks to Ruby.
Why not thanks to Java? That’s peculiar, but when I first heard (and later read) about String immutability, it was presented as a botched idea. Like, yeah, String is immutable, but they had to add StringBuffer and StringBuilder.
So I accepted that the Java makers had made a botched decision and never really gave it much thought.
Until I was writing Ruby and came across symbols. For those who do not know what they look like:
:'a stupid and lengthy name I don\'t want to store in memory repeatedly'
All above are symbols. Needless to say, I paused. I reread the example I found, with such oddities, and wondered, what the hell is this thing preceded by a colon? Is it a String? Well, it was and it wasn’t. An Object? Certainly. Everything is an Object in Ruby even more so than in Java (primitives, anyone?).
So I turned to the Internet. I found two very good links that explained to me what Symbols are for.
- Steve Litt’s explanation is very step-by-step and easy to digest, though it dodges a few questions because of the audience it is aimed at.
- Kevin Clark’s blog post on the same topic, with a few numbers and practical uses.
Both links gave me some understanding that I can still recall fairly well today:
irb(main):002:0> puts :LAFK
LAFK
=> nil
irb(main):003:0> puts :LAFK.to_i
16645
=> nil
irb(main):004:0> puts :LAFK.to_s
LAFK
=> nil
irb(main):005:0> puts :LAFK.type
(irb):5: warning: Object#type is deprecated; use Object#class
Symbol
=> nil
irb(main):006:0> puts :LAFK.class
Symbol
=> nil
Symbols are, in a way, immutable Strings. Strings that will not change, and therefore, should not clutter the memory with repeated occurrences. They serve as great hashcodes (to_i works!) and they won’t allow assignments.
So when I heard about the immutable Strings topic again, I had some deeper knowledge. But while Symbols take immutability all the way down to not allowing assignment, this is not true for Java Strings. An interned String is stored only once, but calling
new String("long name that I don't want to store in memory repeatedly") works around that by creating a separate object.
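To make the difference concrete, here is a minimal sketch of how literals, new String(...) and intern() behave under reference comparison. The class name InternDemo and the helper sameObject are made up for this illustration.

```java
public class InternDemo {
    // Reference comparison: true only when both variables point at the same object.
    // (Hypothetical helper, named here just for the illustration.)
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // Two identical literals are interned, so they share one object.
        String literal = "long name that I don't want to store in memory repeatedly";
        String sameLiteral = "long name that I don't want to store in memory repeatedly";

        // new String(...) explicitly allocates a separate object with equal content.
        String copy = new String(literal);

        System.out.println(sameObject(literal, sameLiteral));   // true
        System.out.println(sameObject(literal, copy));          // false
        System.out.println(literal.equals(copy));               // true
        // intern() hands back the canonical, pooled instance again.
        System.out.println(sameObject(literal, copy.intern())); // true
    }
}
```

This is why equals() is the safe choice for comparing content: == only tells you whether two references happen to point at the same object.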
So, there was more to it than I thought. I started looking around and there certainly was. Besides:
- Immutable = can be cached = good for hashcode
- Immutable + interned = comparing need not be done char by char, a reference check is enough!
- String interning – good for memory
There were also a few other reasons:
- Obvious result of String being final: lots of Java APIs want immutable parameters and rely on this now. Network connections, DB connections and file writers all accept String parameters.
- Strings are used by the classloader, so we don’t want them changed midway.
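The hashcode point from the lists above can be sketched in a few lines. This is only an illustration under my own assumptions: MutableName is a hypothetical class invented for the contrast, not anything from the Java API. It shows what can go wrong when a key changes while it sits in a hash-based collection, and why an immutable String cannot get into that state.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class HashKeyDemo {
    // Hypothetical mutable key, for illustration only.
    static final class MutableName {
        String value;

        MutableName(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof MutableName
                    && Objects.equals(value, ((MutableName) o).value);
        }

        @Override
        public int hashCode() {
            return Objects.hashCode(value);
        }
    }

    // Mutating the key after insertion changes its hash code,
    // so the set looks it up under the wrong hash and misses it.
    static boolean mutableKeyStillFound() {
        Set<MutableName> names = new HashSet<>();
        MutableName key = new MutableName("Ruby");
        names.add(key);
        key.value = "Java";
        return names.contains(key);
    }

    // "Changing" a String only creates a new object; the stored one is untouched.
    static boolean stringKeyStillFound() {
        Set<String> names = new HashSet<>();
        String key = "Ruby";
        names.add(key);
        key = key.concat(" symbols");
        return names.contains("Ruby");
    }

    public static void main(String[] args) {
        System.out.println(mutableKeyStillFound()); // false
        System.out.println(stringKeyStillFound());  // true
    }
}
```

Because a String can never change after construction, its hash code can be computed once and reused, and it stays a valid key for as long as it is stored.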
This article nicely summarizes it.
Of course there is a lot more to String interning, but what I actually wanted to convey is something else.
Be a polyglot programmer
It’s fun. It broadens your horizons. Since our brains quite often work by analogy, seeing the same problems tackled differently helps you understand them better. So, even if your primary language is something you like and feel comfortable with, stray off “the right path” once in a while.
I learned quite a lot about String interning in general and in Java thanks to Ruby symbols.
Similarly, I came to understand pass-by-value in Java better thanks to C pointers.
It pays to know other languages. Or, as Bach (not the composer but the tester) would say: knowledge attracts knowledge.