sql - Comma-delimited fields in a CSV file in PL/SQL
I have

while instr (l_buffer, ',', 1, l_col_no) != 0

which checks whether l_buffer is comma-delimited and, if so, enters the loop.
Now I have a file with these values:

candidatenumber,rnumber,title,organizationcode,organizationname,jobcode,jobname
10223,1600003b,admin officer,00000004,"org land, inc.",orga03,orga03 hr & admin

The file is read as if "org land, inc." were two words, because of the , in between. Is there a way to treat it as one value using instr or anything else?
This is a horrible idea. If you are forced to use character-delimited strings, at the very least you should be able to require a delimiter character that is guaranteed not to appear in regular field values.
The problem you raised can be solved. I show a solution below - it is not even close to efficient, but at least it shouldn't be difficult to follow the logic. I intentionally chose an example (the fifth string) to demonstrate how it can fail. I assumed that commas between a pair of double quotes (an opening one and a closing one) should become "invisible" - treated as if they were not delimiters, but part of the field value. This breaks if a double quote is used in a way different from the "usual" one - see sample string #5. It will also break on other "natural" uses of the comma (where it is not meant as a delimiter) - for example, what if you have a field value of $1,000.00? You would need to "escape" that comma too. One can come up with at least ten more similar situations - are you going to code around all of them?
Now, for my own learning and practice, I pretended that the only way a comma may need to be "escaped" (to become invisible to the tokenization process) is if it is enclosed between an opening and a closing double quote (determined by ordering: a double quote with an odd count from the beginning of the string is an opening one, and a double quote with an even count is a closing one). Here is the solution; the test strings are at the top, including a few that test proper treatment of nulls, and the output follows after.
Good luck!
with
     test_strings (r, s) as (
       select 1, 'abdc, ronfn 0003, "abc, inc.", 9939' from dual union all
       select 2, 'new delhi' from dual union all
       select 3, null from dual union all
       select 4, ',' from dual union all
       select 5, 'if needed, use double quote("), ok?' from dual
     ),
     -- wrap each string in commas so every token sits between two commas
     t (r, s) as (
       select r, ',' || s || ',' from test_strings
     ),
     -- number of commas and number of double quotes in each string
     ct (r, nc, nq) as (
       select r, regexp_count(s, ','), regexp_count(s, '"') from t
     ),
     -- position of every comma
     c (r, pos) as (
       select t.r, instr(t.s, ',', 1, level)
       from   t join ct on t.r = ct.r
       connect by level <= ct.nc and t.r = prior t.r and prior sys_guid() is not null
     ),
     -- position of every double quote (0 when the string has none)
     q (r, pos) as (
       select t.r, instr(t.s, '"', 1, level)
       from   t join ct on t.r = ct.r
       connect by level <= ct.nq and t.r = prior t.r and prior sys_guid() is not null
     ),
     -- keep only commas preceded by an even number of double quotes
     p (r, pos_from, pos_to, rn) as (
       select r, pos, lead(pos) over (partition by r order by pos),
              row_number() over (partition by r order by pos)
       from   c
       where  mod((select count(1) from q where q.r = c.r and q.pos != 0 and q.pos < c.pos), 2) = 0
     )
select   p.r as string_number, p.rn as token_number,
         substr(t.s, p.pos_from + 1, p.pos_to - p.pos_from - 1) as token
from     t join p on t.r = p.r
where    p.pos_to is not null
order by string_number, token_number
;
Results:

string_number token_number token
------------- ------------ --------------------
            1            1 abdc
            1            2  ronfn 0003
            1            3  "abc, inc."
            1            4  9939
            2            1 new delhi
            3            1
            4            1
            4            2
            5            1 if needed

9 rows selected.
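For the loop in the question, the same odd/even double-quote rule can also be sketched procedurally in PL/SQL. This is only an illustration under the assumptions stated in the answer (a comma is "invisible" only between an opening and a closing double quote). The sample value is a shortened copy of the row in the question; l_buffer and l_col_no mirror the names used there, while l_field, l_char, l_in_quotes and emit_field are hypothetical names introduced for this sketch.

```sql
-- Sketch only: split one line on commas that lie outside double quotes.
declare
  l_buffer    varchar2(4000) :=
    '10223,1600003b,admin officer,00000004,"org land, inc.",orga03';
  l_field     varchar2(4000);          -- field being accumulated (hypothetical name)
  l_char      varchar2(1 char);
  l_in_quotes boolean := false;        -- true after an odd number of double quotes
  l_col_no    pls_integer := 0;

  -- Print the current field with its column number, then reset it.
  procedure emit_field is
  begin
    l_col_no := l_col_no + 1;
    dbms_output.put_line(l_col_no || ': ' || l_field);
    l_field := null;
  end emit_field;
begin
  for i in 1 .. nvl(length(l_buffer), 0) loop
    l_char := substr(l_buffer, i, 1);
    if l_char = '"' then
      l_in_quotes := not l_in_quotes;  -- odd quote opens, even quote closes
      l_field := l_field || l_char;    -- keep the quotes as part of the value
    elsif l_char = ',' and not l_in_quotes then
      emit_field;                      -- a real delimiter ends the field
    else
      l_field := l_field || l_char;    -- ordinary character, or a comma inside quotes
    end if;
  end loop;
  emit_field;                          -- the last field has no trailing comma
end;
/
```

With server output enabled, this should print six fields, the fifth being "org land, inc." with its quotes kept. Unlike the query, an unmatched double quote makes the rest of the line one field instead of dropping it, and the $1,000.00 case still fails in the same way.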