Try to mitigate sync-stop issues #646

Open
EugenMayer opened this issue Feb 25, 2019 · 16 comments

@EugenMayer
Owner

Right now, there is no way around the fact that sync can stop. Period :)

There are different reasons for that, and we can only try to "work around" or mitigate the impact of one of them; the other is d4m related and we cannot do anything about it.

Scenarios where this happens ( more often ):

  • npm / webpack / gulp based projects where a lot of files are removed and created during watch / rebuild -> we cannot do anything in this case
  • npm ci / fresh npm install: same as above -> try to always do that before you start the stack ( docker-sync )
  • sometimes a composer install alone is enough to trigger it -> also try to run this before you start the stack

In most cases, this has nothing to do with docker-sync at all, but with OSXFS getting stuck in the FS event queue, which then also stops events for unison in our docker image ( linux, so inode events ) and thus breaks syncing.

There is no fix for that. Sometimes restarting docker-sync helps; when it does, it was an actual unison "too many events" issue, which can also happen.

So what we have here:
a) OSXFS gets stuck -> a docker for mac restart is the only option, sometimes even only an OS restart
b) unison gets stuck -> restarting docker-sync with docker-sync stop && docker-sync clean && docker-sync start helps

There might be a way to auto-detect case b) since we already have monit ( AFAIK some already do that ) and then auto-heal itself.

We can neither avoid nor fix b) in general - it's not even clear whether it would need a reimplementation in Unison or whether it's actually an OSXFS vs inode event propagation issue.
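
For illustration only, the auto-detect idea for case b) could be sketched as a small canary check: write a file on the host and see whether it shows up on the container side. This is not something docker-sync ships; the container name, both paths, and the function names are assumptions and must be adapted to the actual setup.

```shell
#!/bin/sh
# Sketch of a sync health check for case b). All names below are assumptions:
# SYNC_CONTAINER, HOST_DIR and CONTAINER_DIR must match your own setup.
SYNC_CONTAINER="${SYNC_CONTAINER:-app-sync}"   # hypothetical container name
HOST_DIR="${HOST_DIR:-./app}"                  # host side of the sync
CONTAINER_DIR="${CONTAINER_DIR:-/app_sync}"    # hypothetical path in the container

# Returns 0 if a canary file created on the host becomes visible inside the
# container within $1 seconds (default 10), and 1 otherwise.
sync_is_alive() {
  timeout="${1:-10}"
  canary=".sync-canary-$$"
  touch "$HOST_DIR/$canary" || return 1
  waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if docker exec "$SYNC_CONTAINER" test -e "$CONTAINER_DIR/$canary"; then
      rm -f "$HOST_DIR/$canary"
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  rm -f "$HOST_DIR/$canary"
  return 1
}

# The restart sequence described for case b).
restart_sync() {
  docker-sync stop && docker-sync clean && docker-sync start
}
```

Something like sync_is_alive 15 || restart_sync could then be run periodically. It only helps with case b); if OSXFS itself is stuck (case a), the restart will not bring the sync back.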

@EugenMayer
Owner Author

@unnawut i would also add a documentation part to that, documenting the most common sync-stop issues and how to work around them, like npm install or composer install, or simply choosing proper project layouts and sync excludes

@derschatta

derschatta commented Feb 25, 2019

For the record: For me it seems that only issue b) happens. But for me doing a docker-sync stop && docker-sync start gets the sync back working. I never have to do a docker-sync clean nor a docker restart to get it back working.

@grigoryosifov

Not sure if that has been reported so maybe I would repeat someone else but we're working around this by manually going to the unison docker container and executing kill -1 [PID] on the unison process. Then after 2-3 minutes it completes going over the huge Symfony codebase that we have and continues to work normally.
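
As a rough sketch of that workaround (not an official docker-sync command): the container name is passed in by the caller, hup_unison is a made-up helper name, and the use of pidof assumes the container image provides it. If it does not, look up the PID by hand as described above.

```shell
# Send SIGHUP (kill -1) to the unison process inside the sync container.
# Assumption: the image ships `pidof`; otherwise find the PID with `ps`.
hup_unison() {
  container="$1"
  if [ -z "$container" ]; then
    echo "usage: hup_unison <sync-container-name>" >&2
    return 2
  fi
  # $pids is left unquoted on purpose so several PIDs become separate arguments.
  docker exec "$container" sh -c 'pids=$(pidof unison) && kill -1 $pids'
}
```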

@jtborger

Perhaps not the right place to report this, but I ran into the 'stop syncing' issue and googled upon several issue reports.
What I forgot is that I moved my main shared directory (set in Preferences of Docker) and added a symlink to the previous location.
(I moved it out of my ~/Documents directory).
Everything worked, but not smoothly, and often nothing was synced. After I added the new location directly (not the symlinked path) in the Docker preferences, everything started working just fine...
Please dismiss this if it is not at all helpful here...

Thanks for your work Eugen!

@whoisstan

Hi,
when docker-sync worked it was heaven-sent. Way, way faster. But now I am facing the issue of it not working anymore. When my app starts there are no sweeping changes like the ones triggered by 'npm i' or build tasks.

I am seeing this issue here:
https://docker-sync.readthedocs.io/en/latest/getting-started/troubleshooting.html#reproduction-verify-your-changes-have-been-sync-by-unison-app-sync

The only way I can sync is by running
docker-sync stop && docker-sync start
in a separate shell. Which is still faster and less resource intensive in my case than regular volumes.

My fix for now has been this:
fswatch app/ | while read num ; do docker-sync clean && docker-sync start ; done

I put that in a shell script and run it when I start working on the app. Only calling 'docker-sync sync' is not doing anything.
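
A slightly gentler variant of that loop, as a sketch: fswatch can batch events with -o (one line per batch) and -l (latency in seconds), so a burst of changes triggers one restart instead of one per file. on_each_batch is a made-up helper name, and the 2 second latency is an arbitrary choice.

```shell
# Read one line per event batch from stdin and run the given command for each.
# The command's stdin is redirected so it cannot swallow the pending batches.
on_each_batch() {
  while read -r _batch; do
    "$@" </dev/null || echo "command failed: $*" >&2
  done
}

# Usage (not executed here):
#   fswatch -o -l 2 app/ | on_each_batch sh -c 'docker-sync clean && docker-sync start'
```

It is still a band-aid: every batch of changes restarts the sync, so it only pays off while a restart is cheaper than regular volumes.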

Love to get it all working without those band aids:) Happy to provide more details or a sample project.

Thank you,
Stan

@EugenMayer
Owner Author

EugenMayer commented Aug 4, 2019

@whoisstan thanks for that information. Well, there are all sorts of those workarounds around to basically mitigate the issue of fswatch events being stuck on OSX or, with native_osx, OSXFS events being stuck on linux - pick the lesser death to die here.

We cannot really do anything here - i think currently the best way to go is using monit on native_osx to try to detect when OSXFS goes bananas and restart it cleanly without user intervention.

But one of the best ways to remove those issues is:

  • never run npm i, composer install and the like AFTER docker-sync start - run them before ( so no FS events are tracked before -> it does not get stuck )
  • avoid watching your whole project folder - watch only the needed src folders - restructure your project layout if needed for that ( yes, that is PITA, but worth it )
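
The first point can be wrapped in a small start script, sketched below. start_stack is a made-up name, and the layout (package-lock.json and composer.json in the current directory) is an assumption about the project.

```shell
# Run dependency installs BEFORE the sync starts, so the burst of file events
# they cause is never seen by the watcher.
# Assumption: lock/manifest files live in the current directory.
start_stack() {
  if [ -f package-lock.json ]; then
    npm ci || return 1
  fi
  if [ -f composer.json ]; then
    composer install || return 1
  fi
  docker-sync start
}
```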

@whoisstan

@EugenMayer maybe this has to be strongly called out in the documentation. kind of like a realistic/practical guide to using docker-sync. it's still way faster and less heavy than docker volumes.

@erikologic

erikologic commented Aug 5, 2019

you might want to check the cached/delegated bind options while using docker volumes!? for a small project they do just fine
docker-sync was heaven-sent for doing things like npm install in a container to me

@whoisstan

whoisstan commented Aug 5, 2019 via email

@unnawut
Contributor

unnawut commented Aug 6, 2019

maybe this has to be strongly called out in the documentation. kind of like a realistic/practical guide to using docker-sync

Eugen suggested that I add a troubleshooting guide some time ago but I haven't got around to that yet. Will do within the next few days. Feel free to nudge if that doesn't happen.

@unnawut
Contributor

unnawut commented Aug 21, 2019

Hi all. I'm pretty sure I'm missing some troubleshooting ideas.

Feel free to suggest your ideas in this PR: #672. I'll be happy to acknowledge you if there's a public link to the suggestion that I could link to.

@Mathews2115

Hi everyone, I'm sorry to spam but for those "docker-sync is running but not syncing" issues;

I wanted to add one more troubleshooting step that worked for me! In my case, it was actually a Unison issue, not a docker-sync problem: hnsl/unox#24

brew unlink unox && brew link unox - solved my issue immediately.

I hope this helps. Again, sorry for potential spam, I just wanted to record this down somewhere for other people that may be in the same spot.

Thanks for Docker Sync, it has been a life saver for us.

@ostrolucky
Contributor

One more thing we can do is restart the sync client when it crashes, or run it via supervisord. Currently this is implemented for servers only. I'm working on improving the reliability of docker-sync (pretty much this issue), and the sync client can also crash for various reasons while other things are fine. See e.g. hnsl/unox#26 and bcpierce00/unison#336

@EugenMayer
Owner Author

@ostrolucky be aware that we have different options for different strategies

  • native_osx: server restart only ( can be used via monit on the unison container already ). Can have several more issues like stuck inotify events ( unison server restart ) and stuck OSXFS ( docker engine restart ) or even stuck FS events ( macOS restart )
  • unison: server restart ( monit ), unison client restart ( harder without more client dependencies ATM ), even stuck FS events ( macOS restart )
  • rsync: rsync client restart ( harder without more client dependencies ATM ), even stuck FS events ( macOS restart )

So this is a little more complicated due to the different strategies

@ostrolucky
Contributor

ostrolucky commented Apr 26, 2020

The Unison client used inside the native_osx strategy needs to be restarted in case of failure as well.

unison: server restart ( monit )

Supervisord is taking this role, not monit. My post is about program crashing, not when program is stuck on high cpu usage, which I personally didn't experience happening yet.

harder without more client dependencies

Well, a simple, naive solution is something like while true; do unison xxx; done. That's no additional deps. But yeah, ideally we would use something like supervisord for clients as well, but that shouldn't be too big of a problem. The dependency could also be made optional if you are not comfortable with making users install more stuff.

So this is a little more complicated due to the different strategies

From what i can tell this is just about wrapping binary execution of each sync client in a (same) function, so it should be pretty consistent over all strategies and not very complex.
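
To make the "wrap each sync client in the same function" idea concrete, here is a minimal sketch of such a wrapper with a simple backoff. It is not how docker-sync does it today; supervise, MAX_RUNS and the delay values are made up for illustration.

```shell
# Re-run a command whenever it exits, with a growing pause between restarts.
# MAX_RUNS and the backoff values are illustrative, not docker-sync settings.
supervise() {
  max_runs="${MAX_RUNS:-0}"   # 0 means restart forever
  delay=1
  runs=0
  while :; do
    status=0
    "$@" || status=$?
    echo "'$*' exited with status $status" >&2
    runs=$((runs + 1))
    if [ "$max_runs" -gt 0 ] && [ "$runs" -ge "$max_runs" ]; then
      return "$status"
    fi
    sleep "$delay"
    # Double the pause up to a cap so a crash loop does not spin the CPU.
    if [ "$delay" -lt 30 ]; then
      delay=$((delay * 2))
    fi
  done
}

# Usage (not executed here), with the same placeholder arguments as above:
#   supervise unison xxx
```

The same wrapper would work for an rsync or fswatch based client, since it only cares about the exit of whatever command it is given.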

@EugenMayer
Owner Author

The Unison client used inside the native_osx strategy needs to be restarted in case of failure as well.

no, when using the unison-strategy, there is no OSXFS mount, it is a server-to-client sync using macOS fs events on the client and linux inode events on the container side ( server )

9 participants