-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Strings (multiple chars) as 'quotechars' #117
Conversation
The test which is failing is case = (str = " {\\}} ", kwargs = (), x = 0, code = (QUOTED | INVALID | ESCAPED_STRING | EOF), vpos = 3, vlen = 2, tlen = 6) oq = "{{"
cq = "}}" (so after the replacement: where we're setting the wrong code, as we're setting adding More generally, i think this is parsing one characted too many e.g. julia> str = "#= text here =#";
julia> res = Parsers.xparse(String, str; openquotechar="#=", closequotechar="=#", escapechar=0x22, stripquoted=true);
julia> Parsers.getstring(str, res.val, 0x22)
"text here =" EDIT: Solved! |
ac9e26b
to
345007a
Compare
Codecov Report
@@ Coverage Diff @@
## main #117 +/- ##
==========================================
+ Coverage 87.65% 87.83% +0.17%
==========================================
Files 9 9
Lines 2309 2351 +42
==========================================
+ Hits 2024 2065 +41
- Misses 285 286 +1
Continue to review full report at Codecov.
|
okay, tests are passing -- so this could be ready for review (now for me to go figure out if this is actually gonna help me over at nickrobinson251/PowerFlowData.jl#27 ...) Are there any benchmarks we should run when we add things to the Also this changes And this makes the keyword names (also cc @BSnelling, since we chatted a bit about this) |
end | ||
|
||
# Used by CSV.jl | ||
function checkdelim!(buf::AbstractVector{UInt8}, pos, len, options::Options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added comment and type annotation here because i ended up being led astray by this function... and then wondered why we had it at all 😂
@@ -216,7 +223,11 @@ for useio in (false, true) | |||
@test x == case.x | |||
end | |||
@test code == case.code | |||
@test tlen == case.tlen | |||
if Parsers.quoted(code) | |||
@test tlen == ncodeunits(str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i should probably add a comment here:
- we don't check
case.tlen
for quoted input, because our test code replaces the quoted chars in the input with (poentially) multiple characters - but also for this to be the right check, the testcases have to be limited to inputs without characters after a delimiter, which they are right now, but we might add such testcases in the future so we should probably make the assumption explicit
@@ -96,15 +96,20 @@ function Options( | |||
dateformat::Union{Nothing, String, Dates.DateFormat, Format}, | |||
ignorerepeated, ignoreemptylines, comment, quoted, debug, stripwhitespace=false, stripquoted=false) | |||
asciival(wh1) && asciival(wh2) || throw(ArgumentError("whitespace characters must be ASCII")) | |||
asciival(oq) && asciival(cq) && asciival(e) || throw(ArgumentError("openquotechar, closequotechar, and escapechar must be ASCII characters")) | |||
(oq isa String || asciival(oq)) && (cq isa String || asciival(cq)) && asciival(e) || throw(ArgumentError("openquotechar, closequotechar, and escapechar must be ASCII characters")) | |||
(oq == delim) || (cq == delim) || (e == delim) && throw(ArgumentError("delim argument must be different than openquotechar, closequotechar, and escapechar arguments")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to think about how delim
and oq
/cq
interact e.g. when delim::Char in oq::String
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably just check the delim
and oq
/cq
(i) are not equal, (ii) none of the chars in one are in the other
#= text here =#"
to"text here"