Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GODRIVER-2986 Resolve failures in Race Detector Test #1380

Merged
merged 5 commits into from
Sep 15, 2023

Conversation

prestonvasquez
Copy link
Collaborator

@prestonvasquez prestonvasquez commented Sep 12, 2023

GODRIVER-2986

Summary

  • Extend the change stream timeout in the "split large changes" change stream prose test added in GODRIVER-2827
  • Guard read / write to the UST logger's lastOrder field, which is incremented every time the "info" method is called

Background & Motivation

Comment on lines 47 to 54
log.mu.Lock()
defer log.mu.Unlock()

if log.logQueue == nil {
return
}

defer func() { log.lastOrder++ }()
defer func() { atomic.AddUint64(&log.lastOrder, 1) }()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear what log.mu guards access to and why we have to use atomic to increment log.lastOrder here but not compare or add log.lastOrder below. We should either use atomic to access shared values or synchronize the whole function, but not both.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I think it's safest to just keep the lock.

Comment on lines 47 to 54
log.mu.Lock()
defer log.mu.Unlock()

if log.logQueue == nil {
return
}

defer func() { log.lastOrder++ }()
defer func() { atomic.AddUint64(&log.lastOrder, 1) }()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: If you decide to synchronize access only to lastOrder, consider using an atomic.Int32 instead of a uint64. That would simplify the syntax, make checking for overflow easier, and make conversion to int safer (int32 -> int is always safe for the supported platforms).

E.g. simplified logic using an atomic.Int32:

order := log.lastOrder.Add(1)

// If lastOrder overflows, it will become negative. There are no
// other expected conditions where lastOrder should be negative.
if order < 0 {
	panic("lastOrder overflow")
}

if order > log.bufSize {
	return
}

log.logQueue <- orderedLogMessage{
	order:      int(order),
	logMessage: logMessage,
}

Even better, order is an intrinsic characteristic of a message on a channel. We don't actually need to atomically increment and record the order number here. Instead, we can increment an order number when we read the message off the channel here.

E.g.

go func() {
	defer close(orderedLogCh)
	defer close(unorderedLogCh)

	order := 0
	for actual := range logQueue(ctx, exp) {
		msg := actual.logMessage
		if _, ok := unorderedIndices[order]; ok {
			unorderedLogCh <- msg
		} else {
			orderedLogCh <- msg
		}
		order++
	}
}()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order is principally used to determine when to close the logQueue channel, here.

}

func newLogger(olm *observeLogMessages, bufSize int) *Logger {
func newLogger(olm *observeLogMessages, bufSize uint64) *Logger {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: If the number of messages sent will never exceed a 32-bit integer (and unexpected overflow is checked), it's safe to keep bufSize an int and do all the numeric conversions within the Info method.

Copy link
Collaborator

@matthewdale matthewdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍 .

I'd still like to understand why deferring incrementing causes a data race but incrementing at the end of the function doesn't, but we can defer that question.

@qingyang-hu
Copy link
Collaborator

qingyang-hu commented Sep 14, 2023

I'd still like to understand why deferring incrementing causes a data race but incrementing at the end of the function doesn't, but we can defer that question.

Let's add a TODO there.

@prestonvasquez
Copy link
Collaborator Author

prestonvasquez commented Sep 14, 2023

@qingyang-hu @matthewdale I think the defer theory is a red herring. The fact that we read and write to the order while "info" could (and is) used by multiple go routines will eventually result in a race condition regardless of where we increment the order. See this Go Playground example illustrating this behavior: https://go.dev/play/p/KfdcEe8G6qm

And this one where the problem is resolved by guarding the R/W of order: https://go.dev/play/p/sdAopPVwtMy

I think we should add a mutex to the "info" method to guard the order.

@prestonvasquez prestonvasquez changed the title Resolve failures in Race Detector Test GODRIVER-2986 Resolve failures in Race Detector Test Sep 15, 2023
Copy link
Collaborator

@matthewdale matthewdale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

@prestonvasquez prestonvasquez merged commit 59f7519 into mongodb:v1 Sep 15, 2023
23 of 24 checks passed
@prestonvasquez prestonvasquez deleted the resolve-race-test-failures branch September 15, 2023 22:39
blink1073 pushed a commit that referenced this pull request Sep 15, 2023
blink1073 pushed a commit that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
blink1073 pushed a commit to blink1073/mongo-go-driver that referenced this pull request Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants