Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More granular options parsing for Environment #640

Merged
merged 10 commits into from
Oct 17, 2017

Conversation

briadam
Copy link
Contributor

@briadam briadam commented Sep 25, 2017

I'm seeking preliminary feedback on this before forging ahead.

Tests break when Boost Program Options are enabled, but they seem to on dev as well. Guidance?
FAIL: test_BoostInputOptionsParser
FAIL: t01_valid_cycle/rtest01.sh
(HANGS or SLOW): test_mala

I'm going to mark up the code myself with questions. Also these general ones:

  • Should I further condense code shared between FullEnvironment::construct(...) variants?
  • Do we anticipate a need for options_have_been_used checking as in GPMSA?

Copy link
Contributor Author

@briadam briadam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Questions inlined for @dmcdougall and @roystgnr ...

else {
// BMA 20170925: Changed this to default construct the options when none
// are passed in, so downstream checks for != NULL may behave differently
m_optionsObj.reset(new EnvOptionsValues());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now default constructing EnvOptionsValues if not passed in. Is that a problem? Should I remove checks such as:
if (m_optionsObj == NULL) return ".";
queso_require_msg(m_optionsObj, "m_optionsObj variable is NULL");

What about for EmptyEnvironment? Is it important that it have options = NULL as opposed to just default constructed options? The only use case is for member BaseTKGroup::m_emptyEnv which doesn't appear to get used anywhere...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going with "file options override constructor-time C++ options, right? In that case default-constructing EnvOptionsValues seems like the right thing to do; it gives us a copy of the default options, unless they've been overridden in C++, and either way they're ready for the "input files can override us still further" code path.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return "."; does seem redundant now, since that's the default value for EnvOptionsValues::m_subDisplayFileName.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd leave the queso_require in there. Since it's not in user hands anymore we could turn it into a queso_assert if you like, but it's also not in an inner loop anymore so there's no compelling reason to.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And there seems no reason not to give EmptyEnvironment default options, to make it less useless.

void set_defaults();

//! Given prefix, read the input file for parameters named "prefix"+*
void parse(const BaseEnvironment* env, const char* prefix);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept pointers such as const char* and Env* for consistency with ctor instead of changing APIs to use std::string and Env&

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't worry about consistency in a brand-new API, and I wouldn't worry too much about stylistic consistency in QUESO.

Using a reference is IMHO always better than a pointer to a class when you expect the pointer to never be null.

Using a char pointer rather than converting to std::string is often a nice way to buy a bit of efficiency at the expense of a bit of safety, but the former is negligible in a method that'll only be called a handful of times at most per execution, so I'd prefer the latter.

#ifndef QUESO_DISABLE_BOOST_PROGRAM_OPTIONS
m_parser(),
, m_parser(new BoostInputOptionsParser())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we default construct a BoostInputOptionsParser for the empty input file case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, isn't the new BoostInputOptionsParser() bit just in your patch? The previous code was just m_parser(), leaving it an uninitialized scoped pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. I caused confusion here by not making reference and relaying a question I was more asking myself. I took a cue from

m_parser(new BoostInputOptionsParser()),
where it gets initialized, but I agree leaving the scoped ptr uninitialized should be okay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that looks like a mistake on my part. If we need m_parser then we reset it anyway, so there's no point in initializing it in the first place...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roystgnr FWIW, I learned the hard way why this initialization is necessary: operator<< dereferences this object, which would fail after default construction. I added initialization to all classes for now and will promptly remove it post 0.58.0. Same bug could affect copy ctors, but that use case isn't exercised by any code right now, so ignoring.

m_seed = UQ_ENV_SEED_ODV;
m_platformName = UQ_ENV_PLATFORM_NAME_ODV;
m_identifyingString = UQ_ENV_IDENTIFYING_STRING_ODV;
m_numDebugParams = UQ_ENV_NUM_DEBUG_PARAMS_ODV;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why aren't debug parameters parsed from the input file? Should I add them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These parameters appear to be an atavism that haven't actually been used for anything:

  • since 2010, at the latest
  • in the version control master, rather than a branch
  • in git, rather than svn

So our second best option here would be "dig through ancient svn branches to figure out what these are for", and our best would be "just get rid of them".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if @dmcdougall agrees, I can purge them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with this.

"Allow all inter0 nodes to write to output file");

// convert the current member set of integers to a string for default handling
std::ostringstream sdas_as_string;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm converting the set back to string for default handling. Is the trailing whitespace a problem?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm pretty sure it is. Whitespace in the input files is fine with either boost or getpot, but I'm not sure either trims whitespace from input API parameters, and if they do it's surely luck rather than officially supported behavior.

{
m_env = env;

if (m_env->optionsInputFileName().empty()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should parse be robust to the case of an empty input file string? For now, I'm leaving the error in place. Seems calling parse means you really want to parse, but I could also see just silently skipping the parse if nothing to do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, keep the error. Belt-and-suspenders.

m_optionsObj.reset(new EnvOptionsValues(this, prefix));
}
if (!m_optionsInputFileName.empty())
readOptionsInputFile(prefix);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a semantic change, not just an improved capability, so let's be wary of regressions. I can't imagine anyone was actually using the behavior of "pass an input file name and a non-null option object pointer in and expect the former to be ignored", but I've seen stranger code in the past.

@@ -1491,26 +1484,27 @@ FullEnvironment::construct (RawType_MPI_Comm inputComm,
FullEnvironment::FullEnvironment(
const char* passedOptionsInputFileName,
const char* prefix,
EnvOptionsValues* alternativeOptionsValues)
EnvOptionsValues* initialOptionsValues)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the name changes. "Self-documenting code" rather than "atavistic deceptive code" is a big plus.

// * ELSE use default options
// New behavior:
// * START with default options or passed alternative options
// * IF input file, any parsed options override stored values
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd put this comment in the header, where many people will read it as they use the API, rather than in the source file, where nobody but core developers will look unless they're already tearing their hair out over failure to understand the behavior.

@roystgnr
Copy link
Member

Tests break when Boost Program Options are enabled, but they seem to on dev as well. Guidance?
FAIL: test_BoostInputOptionsParser
FAIL: t01_valid_cycle/rtest01.sh
(HANGS or SLOW): test_mala

I just discovered that my gpmsa_new_functional branch doesn't even link when boost::program_options is enabled, so I'm merely embarrassed rather than surprised that dev isn't getting much love for that configuration either. I could swear I had it working in January...

@roystgnr
Copy link
Member

test_mala finished quickly for me, but the log file is filled with NaN warnings.

@roystgnr
Copy link
Member

The other two failed.

@roystgnr
Copy link
Member

  what():  the argument (''false'') for option 'bool' is invalid. Valid choices are 'on|off', 'yes|no', '1|0' and 'true|false'

Coincidentally, "what" was my reaction as well.

@roystgnr
Copy link
Member

roystgnr commented Sep 26, 2017

I saw Goody @dmcdougall dancing with the devil!

GetPot appears to be much more flexible when in how it deals with booleans and a bit trickier with strings and non-scalar options - it strips enclosing quotes if they're there, while boost::program_options gets confused by quotes around booleans and retains quotes around strings and non-scalar values.

So I think we'd actually be fine if we just used boost-compatible files for everything (and reverting the single quotes added to test01_valid_cycle/tgaCycle.inp backs me up on this)... which makes me wonder why exactly Damon went to all the trouble of changing them in the first place.

@roystgnr
Copy link
Member

Should I further condense code shared between FullEnvironment::construct(...) variants?

On the one hand, I'm a big fan of refactoring that reduces code size, but on the other hand only you know if that's worth the time.

@roystgnr
Copy link
Member

Do we anticipate a need for options_have_been_used checking as in GPMSA?

No, but the whole point of that checking was to handle unanticipated application code mistakes, so it wouldn't hurt to use that idiom more often, if you want.

@briadam
Copy link
Contributor Author

briadam commented Sep 26, 2017

Thanks for all the helpful review comments!
Yeah I see the same as you with t01_valid_cycle ([edit] and test_BoostInputOptionsParser). I still have 1 fail to chase down. I'm going to focus on integrating your review first and then if you haven't picked this issue back up, can revisit.

@briadam
Copy link
Contributor Author

briadam commented Sep 27, 2017

So I think we'd actually be fine if we just used boost-compatible files for everything (and reverting the single quotes added to test01_valid_cycle/tgaCycle.inp backs me up on this)... which makes me wonder why exactly Damon went to all the trouble of changing them in the first place.

I made the rest of Roy's requested changes. Should I make a commit on this branch (or dev) to reverse the input file changes from #457 ? Doing so seems to fix most tests with BPO enabled, though not test_intercomm0, so there might be a real regression with my changes.

briadam added a commit to briadam/queso that referenced this pull request Sep 27, 2017
Partially addresses libqueso#569 by addressing feedback in PR
libqueso#640, specifically:

 * Keep comments in header
 * Prefer reference to pointer
 * Cleanup m_optionsObj tests for NULL (adding require !NULL in a couple places)
 * Only initialize BoostInputOptionsParser when needed
 * Take care not to have trailing whitespace when converting set of
   integer to string

Some tests still fail with Boost program options enabled.
@codecov
Copy link

codecov bot commented Sep 27, 2017

Codecov Report

Merging #640 into dev will increase coverage by 0.24%.
The diff coverage is 89.23%.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev     #640      +/-   ##
==========================================
+ Coverage   75.78%   76.03%   +0.24%     
==========================================
  Files         318      318              
  Lines       24267    24699     +432     
==========================================
+ Hits        18390    18779     +389     
- Misses       5877     5920      +43
Impacted Files Coverage Δ
...c/core/src/InfiniteDimensionalMCMCSamplerOptions.C 1.78% <0%> (-1.67%) ⬇️
src/stats/src/StatisticalInverseProblemOptions.C 80% <100%> (+11.7%) ⬆️
src/core/src/OptimizerOptions.C 85.29% <100%> (+10.93%) ⬆️
src/misc/src/Miscellaneous.C 79.49% <12.5%> (-2.4%) ⬇️
src/stats/src/MLSamplingOptions.C 69.01% <85.71%> (+12.6%) ⬆️
src/core/src/EnvironmentOptions.C 91.22% <92.18%> (-3.06%) ⬇️
src/stats/src/MLSamplingLevelOptions.C 93.08% <93.36%> (+2.89%) ⬆️
src/stats/src/StatisticalForwardProblemOptions.C 74.24% <94.44%> (+11.45%) ⬆️
src/core/src/Environment.C 69.26% <95.23%> (+0.04%) ⬆️
src/stats/src/MetropolisHastingsSGOptions.C 92.23% <95.83%> (+2.84%) ⬆️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5803bf2...4d3fa43. Read the comment docs.

@dmcdougall
Copy link
Member

So I think we'd actually be fine if we just used boost-compatible files for everything (and reverting the single quotes added to test01_valid_cycle/tgaCycle.inp backs me up on this)... which makes me wonder why exactly Damon went to all the trouble of changing them in the first place.

I could've sworn they weren't cross-compatible. That's on me.

@dmcdougall
Copy link
Member

I'm ok with doing the file-change in a different PR.

@dmcdougall dmcdougall added this to the v0.59.0 milestone Sep 28, 2017
@roystgnr
Copy link
Member

Isn't test_intercomm0 the one that we expect to randomly fail, because it doesn't use a fixed PRNG seed? The price of stochastic regression testing...

@roystgnr
Copy link
Member

Yeah, let's do the input format reversion in a separate PR.

@briadam
Copy link
Contributor Author

briadam commented Sep 29, 2017

Well good news, it (test_intercomm0) is behaving as expected then! Issue #642 and PR #644 should resolve that part of this discussion.

@briadam
Copy link
Contributor Author

briadam commented Sep 29, 2017

@dmcdougall I'm working to preserve BPO support in this PR, if that affects which release we target with this. I realize other aspects of this commit do change behavior, but at @roystgnr pointed out, only in isolated ways. I'm thinking to take a second pass to remove BPO support once you draw a release with the first pass options refactor.

@dmcdougall
Copy link
Member

Isn't test_intercomm0 the one that we expect to randomly fail, because it doesn't use a fixed PRNG seed?

No. It should pass every time.

@briadam briadam modified the milestones: v0.59.0, v0.58.0 Oct 12, 2017
@briadam
Copy link
Contributor Author

briadam commented Oct 12, 2017

@dmcdougall @roystgnr I decided to make this change in a way that's largely backward compatible and preserves Boost Program Options parsing. Would prefer this PR make it into the next release as it's needed for effective use of GPMSA scalar capability in Dakota.

@roystgnr
Copy link
Member

Can you fix up the conflicts?

@briadam
Copy link
Contributor Author

briadam commented Oct 12, 2017

Yes, working on it now. The Github docs suggested a process that would mean I make a merge commit on dev, which is surely not right. Should I just merge dev into my options_handling_569 branch and resolve the conflict? Or something else?

@briadam
Copy link
Contributor Author

briadam commented Oct 12, 2017

Well, that'll teach me to try to do the merge with the webtool. I'll do locally this afternoon and fix this up...

@roystgnr
Copy link
Member

I usually "git fetch origin" and "git rebase origin/dev" and then manually fix conflicts as they come up in the rebase. That's sometimes easier said than done, though.

briadam added a commit to briadam/queso that referenced this pull request Oct 12, 2017
Partially addresses libqueso#569 by addressing feedback in PR
libqueso#640, specifically:

 * Keep comments in header
 * Prefer reference to pointer
 * Cleanup m_optionsObj tests for NULL (adding require !NULL in a couple places)
 * Only initialize BoostInputOptionsParser when needed
 * Take care not to have trailing whitespace when converting set of
   integer to string

Some tests still fail with Boost program options enabled.
@briadam
Copy link
Contributor Author

briadam commented Oct 12, 2017

Okay I was just being silly and had messed up that merge. I rebased and force pushed to overwrite the bum merge. I'm hoping this passes checks now.

However, there's a pre-existing issue on dev that prevents building with Boost Program Options: #655 Probably merge this PR and then see if anyone can fix that. I struggled to get bisect working right.

@briadam
Copy link
Contributor Author

briadam commented Oct 15, 2017

Assuming checks pass, this can be merged. I'll note some cleanup needs in #569 and hopefully take one more pass before calling that done.

@briadam
Copy link
Contributor Author

briadam commented Oct 16, 2017

I'm planing to take a cleanup pass today, including adding public default ctors to all the options classes I mucked with.

Partially addresses libqueso#569

Following the lead from GPMSAOptions, refactor EnvOptionsValues to
allow construction from defaults vs. passed options object, with
optional override from parsed user input file.  Finer grained
construct, set_defaults(), parse(), further allow a client to
intersperse setting options via C++ as well since member variables are
public.

This commit changes a couple historical behaviors:

 * If there was an inbound EnvOptionsValues object, the input file
   never got parsed.

 * An Environment could have a NULL m_optionsObj; now we construct one
   with default values.  This could lead to subsequent code cleanup.

Retains optional boost::program_options parser for now.
Partially addresses libqueso#569 by addressing feedback in PR
libqueso#640, specifically:

 * Keep comments in header
 * Prefer reference to pointer
 * Cleanup m_optionsObj tests for NULL (adding require !NULL in a couple places)
 * Only initialize BoostInputOptionsParser when needed
 * Take care not to have trailing whitespace when converting set of
   integer to string

Some tests still fail with Boost program options enabled.
Refactor in support of libqueso#569 to enable various layered
phases of constructing SipOptionsValues.  This mirrors work done in
EnvOptionsValues.

Also minor cleanup in EnvOptionsValues.
Refactor in support of libqueso#569 to enable various layered
phases of constructing MhOptionsValues.  This mirrors work done in
EnvOptionsValues.
Refactor in support of libqueso#569 to enable various layered
phases of constructing SfpOptionsValues.  This mirrors work done in
other *OptionsValues.
Refactor in support of libqueso#569 to enable various layered
phases of constructing McOptionsValues.  This mirrors work done in
other *OptionsValues.
Refactor in support of libqueso#569 to enable various layered
phases of constructing OptimizerOptions, MLSamplingLeveloptions, and
InfiniteDimensionalMCMCSamplerOptions.  This mirrors work done in
other *OptionsValues.
Finish up libqueso#569:
 * Add default ctor to all Options classes and always init defaults
   and m_env
 * Always initialize BoostInputOptionsParser since operator<< may
   dereference it (there remains a bug w.r.t. same in copy ctors, but
   not going to take time on it as no current use cases and code is
   deprecated)
@briadam
Copy link
Contributor Author

briadam commented Oct 16, 2017

@roystgnr I rebased this on top of your GPMSAOption BPO fix. A quick review wouldn't hurt, but it's a ton of code, and my recent revisions shouldn't be controversial. I'm good for merge when you are.

@@ -109,7 +118,7 @@ class SfpOptionsValues

private:
#ifndef QUESO_DISABLE_BOOST_PROGRAM_OPTIONS
BoostInputOptionsParser * m_parser;
ScopedPtr<BoostInputOptionsParser>::Type m_parser;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shoot, this fixed a memory leak, didn't it?

Thanks!

@roystgnr
Copy link
Member

Pinging @dmcdougall - IMHO this should be good to merge, but if you've got time it wouldn't hurt to get your eyes on it too first.

@roystgnr
Copy link
Member

@dmcdougall, in person: "I trust you."

I'm sure that'll come back to haunt him later but for now it just means we're ready to merge.

@roystgnr roystgnr merged commit 469911a into libqueso:dev Oct 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants