About strings, chars, bytes, and memory

I’ve been reading and watching material about programming, and several sources seem to suggest that, as a programmer, I should be blogging. Not because I should expect people to read it, but for the exercise of writing it.
While it might seem otherwise below, that’s really the whole point of this post: me writing it.

The bug

Recently, I had to write a PL/I program that received a string as input and sent a string on as output, for that string to eventually end up in somebody’s inbox. Yes, it was a simple program that other programs could use to send mail.

One of my initial tests showed that when I received this string:

This is a test

What ended up in the inbox was this string:

♫This is a test

My first thought went to “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)”.

Since it’s a PL/I mainframe program, strings are EBCDIC by default. It’s a CICS program, and CICS does some conversions automatically; changing that is a system-wide setting, and not something for me to change. Besides, if that were actually the problem, someone would surely have noticed before I did.

As it turned out, this was actually caused by a combination of how PL/I stores strings and, of course, something I did. This led me to think that, as a programmer, it’s quite useful to also know how your language of choice stores strings.
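
To make “how a language stores strings” a bit more concrete, here is a minimal Python sketch of a length-prefixed string: a two-byte length followed by the characters, which is the layout PL/I uses for VARYING strings. The helper name varying_bytes and the choice of the cp037 EBCDIC code page are assumptions made for illustration only; this is not the code from the program described above.

```python
import struct


def varying_bytes(text: str) -> bytes:
    """Hypothetical helper: lay out text as a 2-byte big-endian
    length prefix followed by the EBCDIC (cp037) characters."""
    payload = text.encode("cp037")
    return struct.pack(">H", len(payload)) + payload


raw = varying_bytes("This is a test")

prefix = raw[:2]   # the length field, not part of the text
payload = raw[2:]  # the actual characters

print(prefix.hex())             # 000e (14 characters)
print(payload.decode("cp037"))  # This is a test
print(len(raw))                 # 16 bytes in memory for 14 characters
```

The point of the sketch is only that the bytes in memory are not the same thing as the characters in the string. If a buffer like this is passed along as if every byte were text, the length field travels with it as extra, non-printable bytes in front of the message.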


Technical architect and versatile developer