SPLITSTR and NULL \0

Moru

I'm having trouble splitting strings on NULL, i.e. CHR$(0). I always end up with only the first part of the string.

Code (glbasic):
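LOCAL a$, count, array$[]

// Three fields separated by a zero byte
a$ = "one"+CHR$(0)+"two"+CHR$(0)+"three"

// Split on CHR$(0); FALSE means empty fields are not skipped
count = SPLITSTR(a$, array$[], CHR$(0), FALSE)

DEBUG "fields: " + count + "\n"

FOREACH b$ IN array$[]
    DEBUG " ["+b$ + "]\n"
NEXT

END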


Result:
fields: 1
[one]

Expected:
fields: 3
[one]
[two]
[three]


Moebius

I think you might have null-terminated C strings to blame for that.

It seems that the strings simply get cut off at the first null character, even when nothing is used to split the string:
Code (glbasic):
LOCAL Array$[]
SPLITSTR("string\0 is cut off",Array$[],"")
PRINT LEN(Array$[]), 10, 10
PRINT Array$[0], 10, 40
SHOWSCREEN
KEYWAIT
END


This displays "1" and "string"...
Endless Loop: n., see Loop, Endless.
Loop, Endless: n., see Endless Loop.
- Random Shack Data Processing Dictionary

Moru

Yes, it's probably the fault of null-terminated C strings. But this is BASIC, so this should work :-) Especially since Gernot has shown this as the solution in other places: use SPLITSTR with NULL to split a string you receive over the network.

Moebius

Ideally it should work. We'll just have to wait for Gernot to see if it's possible...
A lot of other commands seem to be limited by this as well (INSTR, REPLACE, LEN, PRINT, and seemingly just about everything that deals with strings...)
In fact, the same problem seems to occur with the networking commands:
Code (glbasic):
SOCK_INIT()
sock% = SOCK_UDPOPEN(123456)
sock2% = SOCK_UDPOPEN(123457)
SOCK_UDPSEND(sock,"hello\0world",0x7F000001,123457)
SLEEP 1000
PRINT SOCK_RECV(sock2,msg$,1000), 10, 10
PRINT msg$, 10, 40
SHOWSCREEN
KEYWAIT
END


SOCK_RECV returns 5, indicating that only 5 bytes were received (i.e. the string was cut off after "hello").

I tried before to send values across the network as bytes in strings, but any null character seems to cause issues...  and it would take a heck of a lot of rewriting to get around all of these null-termination problems.

Moru

Networking works fine with null but you have to write it like this:
SOCK_UDPSEND(sock,"hello"+CHR$(0)+"world",0x7F000001,123457)

Just don't expect to be able to print a string with null in it :-)

Moebius

I was going off the received length to see if the string made it...  and sure enough using CHR$(0) instead of "\0" works...  and I don't understand why...
Okay I stand corrected on a lot of things where CHR$(0) works but "\0" doesn't (??)
You're right though - SPLITSTR is an exception...

Moru

I'm guessing the \0 causes trouble in string literals because GLBasic compiles to C. CHR$(0) is just a function returning a character, so it isn't affected by that problem.

Kitty Hello

ok:

C++ code ahead, easy, though.

[code]
const char* pStr = "Test\0Test";

DGStr mystring_Str = pStr;
   // step 1, find length
   int len = strlen(pStr);
       // will find first '\0' character and return the length
[/code]
-> len = 4
The char pointer has _no_ idea how long the string is.
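
To make the truncation concrete, here is a minimal C++ sketch. It uses std::string as a stand-in for DGStr (an assumption, since DGStr's internals are not shown in this thread), and the helper names are made up for illustration. It shows that building a string from a bare char pointer stops at the first '\0', while passing the length explicitly keeps every byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: build a string the way a "const char*" constructor does.
// The length has to be discovered by scanning, so it stops at the first '\0'.
std::string from_pointer(const char* p) {
    return std::string(p);
}

// Hypothetical helper: build a string from a pointer plus an explicit byte count.
// Nothing is scanned for '\0', so embedded zero bytes survive.
std::string from_pointer_and_length(const char* p, std::size_t n) {
    return std::string(p, n);
}

// Small self-check of the two behaviours.
void demo_truncation() {
    const char raw[] = "Test\0Test";        // 9 characters plus the implicit terminator
    assert(std::strlen(raw) == 4);          // strlen stops at the first '\0'
    assert(from_pointer(raw).size() == 4);  // so does the pointer constructor
    assert(from_pointer_and_length(raw, sizeof(raw) - 1).size() == 9);
}
```

Calling demo_truncation() runs the three checks. The point is only that the information "how long is this" has to come from somewhere other than the pointer.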


Code (glbasic):

DGStr s = DGStr("test") + CHR_Str(0) + DGStr("test");


functions called:
2x the DGStr constructor, 1x CHR$ -> returning a string object with a length of 1 byte (value 0).
Then 2x the + operator, concatenating the strings.

So, the 0 character can only be concatenated with CHR$, or READSTR. But a GLBasic string can safely contain 0 characters for all operations.
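
The concatenation case can be sketched the same way. Again std::string stands in for DGStr (an assumption), and chr_str is a hypothetical helper that mimics CHR$; neither is GLBasic's actual runtime code.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for CHR$: a one-byte string holding the given code.
// The (count, char) constructor is used, so a 0 byte is stored like any other.
std::string chr_str(int code) {
    return std::string(1, static_cast<char>(code));
}

// Mirrors DGStr("test") + CHR_Str(0) + DGStr("test") from the post above.
std::string build_with_zero() {
    return std::string("test") + chr_str(0) + std::string("test");
}

// Small self-check: the zero byte is kept and can be found again.
void demo_concat() {
    std::string s = build_with_zero();
    assert(s.size() == 9u);      // 4 + 1 + 4 bytes, nothing cut off
    assert(s[4] == '\0');        // the zero byte sits in the middle
    assert(s.find('\0') == 4u);  // a length-aware search finds it
}
```

Because each piece carries its own length, the + operator never needs to look for a terminator.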

Moru

Quote from: Kitty Hello on 2011-Mar-28
So, the 0 character can only be concatenated with CHR$, or READSTR. But a GLBasic string can safely contain 0 characters for all operations.

... Except with SPLITSTR, which is what the first post was about :-)
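
For comparison, a length-aware split has no trouble with a zero delimiter. This is a sketch in C++ with std::string, not GLBasic's SPLITSTR implementation. split_on is a hypothetical name, and unlike SPLITSTR it takes a single delimiter character rather than a set of delimiters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical splitter: cuts text at every occurrence of delim and keeps
// empty fields. It relies on text.size(), never on a terminating '\0'.
std::vector<std::string> split_on(const std::string& text, char delim) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = text.find(delim, start);
        if (pos == std::string::npos) {
            fields.push_back(text.substr(start));  // last field
            break;
        }
        fields.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    return fields;
}

// Same data as the first post: "one" + CHR$(0) + "two" + CHR$(0) + "three"
void demo_split() {
    std::string a = std::string("one") + '\0' + "two" + '\0' + "three";
    std::vector<std::string> parts = split_on(a, '\0');
    assert(parts.size() == 3u);
    assert(parts[0] == "one" && parts[1] == "two" && parts[2] == "three");
}
```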

Kitty Hello

You're right. Must I change that?

Moru

Not for me, I just pointed out that it doesn't work as advertised.