Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WordPress and Recursive unserialize #114

Open
widoz opened this issue Dec 20, 2017 · 15 comments
Open

WordPress and Recursive unserialize #114

widoz opened this issue Dec 20, 2017 · 15 comments

Comments

@widoz
Copy link
Contributor

widoz commented Dec 20, 2017

Context

After the post in medium about the WordPress and serialize bug that can be read here https://medium.com/websec/wordpress-and-recursive-unserialize-5518b124b23b the fix has been applied to search-and-replace.

What we have now
Now I've refactoring some code and create test for the method recursive_unserialize_replace.
To us this is a crucial part, be sure the bug isn't reintroduced and will never be by add more tests for that logic.

I created a data provide to allow us to pass a data and an expected replaced viewable here https://github.com/inpsyde/search-and-replace/blob/master/tests/UnitTests/Database/ReplaceTest.php#L348

What is needed
It's needed more example of possible serialized data/expected in order to introduce them into the test.

Regarding the bug as described in the article, the $s = "s:+3:\"grr\";"; let out s:3:\”grr\”;.
I'm not pretty sure but the issue is with the missed + or it's about the missed ; at the end of the string?

@bueltge
Copy link
Contributor

bueltge commented Feb 9, 2018

@bueltge bueltge added this to the 3.2.0 milestone Oct 28, 2018
@bvdv
Copy link
Contributor

bvdv commented Nov 22, 2018

I checked function test_recursive_unserialize_replace(ReplaceTest.php) for a, O, C, o, s, S, +, ; (as it described in medium link). Then I added to serializedDataProvider() some data for checking and it works, if I understand correctly.
Data that added besides "a, O, C, o, s, S, +, ;" :

	     [
		'from'     => 'a',
		'to'       => 'b',
		'data'     => 'O:3:%22foo%22:2:{s:4:%22file%22;s:9:%22shell.php%22;s:4:%22data%22;s:5:%22aaaa%22;}',
		'expected' => 's:89:"O:3:%22foo%22:2:{s:4:%22file%22;s:9:%22shell.php%22;s:4:%22d|bt|b%22;s:5:%22|b|b|b|b%22;}";',
	     ],
             [
                 'from'     => 'l',
                 'to'       => 'x',
                 'data'     => 's:8:"last_log";s:19:"making a test entry";',
                 'expected' => 's:8:"xast_xog";',
             ],
             [
                 'from'     => 'x',
                 'to'       => 'x',
                 'data'     => 's:11:"\x00*\x00log_date";s:8:"07-09-17";',
                 'expected' => 's:0:"";',
             ],

@bueltge
Copy link
Contributor

bueltge commented Nov 22, 2018

HI @bvdv
please can you also try with the serialized sting, there was from @widoz mentioned?

$s = "s:+3:"grr";"; let out s:3:\”grr\”;

@bvdv
Copy link
Contributor

bvdv commented Nov 22, 2018

Hi @bueltge,

Sure, I will try.

bvdv added a commit to bvdv/search-and-replace that referenced this issue Nov 23, 2018
Commit related to wp-media#114. More test data for testing recursive_unserialize_replace()
@bvdv
Copy link
Contributor

bvdv commented Nov 23, 2018

I checked $s = "s:+3:\"grr\";"; let out s:3:\”grr\”;. Current version of recursive_unserialize_replace() for 's:+3:\”grr\”;', return 's:18:"s:+3:\”grr\”;";', that is correct, according suggested rules in Medium post. 's:+3:\”grr\”;' is not wp serialized input and not serialized string and so on.

@widoz
Copy link
Contributor Author

widoz commented Nov 24, 2018

Hi @bvdv,

Thanks for the help.
I've a question because isn't clear to me by your message.

The recursive_unserialize_replace is working as expected and the bug has been solved right?
I was unsure about which kind of result we had to expect for that kind of serialized value.

@bvdv
Copy link
Contributor

bvdv commented Nov 25, 2018

Hi @widoz,
I'm sorry for unclear explanation.
's:+3:\”grr\”; is not wp serialized value. If I understand correctly, one of main idea in Medium post that the recursive_unserialize_replace should not perform unserialize on not wp serialized value, from that perspective recursive_unserialize_replace works as expected, bug has been solved. In Medium post there is proposed fix code, this code return same result as current recursive_unserialize_replace. Let me know, if explanation still not quite clear. Sorry.

@widoz
Copy link
Contributor Author

widoz commented Nov 25, 2018

Hi @bvdv,

Thanks for the explanation. I wasn't sure we had to unserialize it or not.

I have a doubt now that concern the replace to 's:+3:\”grr\”;'.
In this case shouldn't we avoid to replace it with 's:18:"s:+3:\”grr\”;";' ?
Seems this will corrupt the original value that would lead to an incorrect result if then that value will be unserialized by third party function.

Probably I'm still missing something.

@bvdv
Copy link
Contributor

bvdv commented Nov 25, 2018

Hi @widoz,

Sorry again, my bad, result for 's:+3:\”grr\”;' is 's:17:"s:+3:\”grr\”;";'.

@widoz
Copy link
Contributor Author

widoz commented Nov 25, 2018

What I meant was, shouldn't 's:+3:\”grr\”;' be replaced with 's:+3:\”aaaaaa\”;' after the replace if I want to replace grr to aaaaaa without being serialized?

Because sounds like this step serialize the string and I think doesn't make sense because the original one is a valid serialized string.

if ( $marker ) {
    $data = maybe_serialize( $data );
}

Even my example above is wrong, technically we should skip strings like that and produce a log for the user that an invalid serialized string was found.
Just an idea.

@bvdv
Copy link
Contributor

bvdv commented Nov 25, 2018

@widoz

's:+3:\”grr\”;' is serialized because function looks like

public function recursive_unserialize_replace( $from = '', $to = '', $data = '', $serialised = true ) {
...
  if ( $serialised ) {
     // @codingStandardsIgnoreLine
     $data = serialize( $data );
  }
...
}

I thought that it is as some possibility for user( or third party function) to serialize or not something even it is invalid value, if the user really wants it. Because of that, I never thought why it being serialized?
Indeed, maybe it is good idea produce log for user in such case.

@widoz
Copy link
Contributor Author

widoz commented Nov 25, 2018

Yes, sorry that is the right code that serialize the string not the one I have posted.
But the doubt remains, the 's:+3:\”grr\”;' should replace as string not as serialized string correct?

Also, something here https://medium.com/websec/wordpress-and-recursive-unserialize-5518b124b23b look different than our implementation.

For example: We are missing the conditional that check if the string is a serialized string and it's serialized. The latest else block replace the normal string so 's:+3:\”grr\”;' should becomes 's:+3:\”aaaaa\”;' .

The problem that this get serialized seems it is because the default value for the serialized parameter is set to true instead of false.

Means probably we have to revisit where and how this functions is called.

I'm wrong or missing something?

@bvdv
Copy link
Contributor

bvdv commented Nov 25, 2018

@widoz,
Regarding to the serialized parameter is set to true, there is commit ee51a4c

But the doubt remains, the 's:+3:\”grr\”;' should replace as string not as serialized string correct?

It depends, according to that commit there was a need to receive serialized data or it was idea to set to false internally. It's hard for me to say right away, what that idea was.

Also, something here https://medium.com/websec/wordpress-and-recursive-unserialize-5518b124b23b look different than our implementation.
For example: We are missing the conditional that check if the string is a serialized string and it's >serialized. The latest else block replace the normal string so 's:+3:\”grr\”;' should becomes >'s:+3:\”aaaaa\”;' .

So far I have considered this function just how it implements suggested rules from both Medium post.
If I understand correctly, maybe implementation not same but suggested rules are implemented. Main idea is Unserialize must not be used on untrusted input.

I'm wrong or missing something?

I don't know, all comments are fair. Currently I think that I missing something. :)

Means probably we have to revisit where and how this functions is called.

Look like implementation from https://medium.com/websec/wordpress-and-recursive-unserialize-5518b124b23b has more checks at once at input.

@widoz
Copy link
Contributor Author

widoz commented Nov 25, 2018

@bvdv I had a look at the code again, could you kindly check the changes I made here https://github.com/inpsyde/search-and-replace/tree/issue/114 ?
It's an idea and I want your opinion because to me sounds like we have to work with the recursive function only and if only the data is previously checked as serialized.

I haven't tested a different solution that could be something like:

if ( $serialised && $unserialized ) {
    // @codingStandardsIgnoreLine
    $data = serialize( $data );
}

Anyway, to me sounds like we have just replace the invalid serialized value as a normal string or do nothing. Assuming that since it's a valid serialized data, who wrote that in the db know what he's doing and in case we convert it to a serialized string or we simply replace the value we are compromising the data. We cannot assume 100% every invalid WP serialized string is a problem.

What do you think?

Sorry I don't remember the commit, was a lot of time ago :/ I need some more time to remember why that decision was made.

@bvdv
Copy link
Contributor

bvdv commented Nov 27, 2018

Hi @widoz,

sorry for my late reply.

I had a look at the code again, could you kindly check the changes I made here https://github.com/inpsyde/search-and-replace/tree/issue/114 ?

Sorry, I not quite understand, why new changed function return unserialized data for valid serialized input? May be for that reason $serialised parameter was set to true, because first if conditional with !is_serialized_string( $data ) never executes for valid serialized input, and $serialised need to be set to true for serialize back changed data. Or not?

if ( $unserialized !== false && ! is_serialized_string( $data ) ) {
    $data = $this->recursive_unserialize_replace( $from, $to, $unserialized);
}

It's an idea and I want your opinion because to me sounds like we have to work with the recursive function only and if only the data is previously checked as serialized.

Seems yes, if it don't assume use recursively maybe_unserialize.

Anyway, to me sounds like we have just replace the invalid serialized value as a normal string or do nothing. Assuming that since it's a valid serialized data, who wrote that in the db know what he's doing and in case we convert it to a serialized string or we simply replace the value we are compromising the data. We cannot assume 100% every invalid WP serialized string is a problem.

I too think that function should just replace the invalid serialized value as a normal string or return some alert that value is invalid but as you think not ever invalid string is a problem, just replace is preferred.

Could you please little bit explain what to do this code.

// Do not allow to return valid serialized data,
// If after replacement data is_serialized then add one | to the replacement.
if ( is_serialized( $data, false ) ) {
    $data = str_replace( $from, '|' . $to, $tmp_data );
}

@bvdv bvdv mentioned this issue Nov 28, 2018
@bueltge bueltge removed this from the 3.2.0 milestone Jan 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants