iRODS replication fails when putting files via Davrods from the OS X built-in WebDAV client #12

Open
ilarik opened this issue Dec 12, 2017 · 4 comments

@ilarik
ilarik commented Dec 12, 2017

Hi,

I recently noticed that when putting files via Davrods to a replicated resource in iRODS 4.1.10 or iRODS 4.1.11, the replication fails and the new data object lands only in the first resource server it hits.

The really strange thing is that the failure occurs only when connecting via the OS X built-in WebDAV client. The Linux WebDAV clients my colleagues tested (GNOME Nautilus or davfs, I suppose) worked perfectly. Cyberduck also works.

For example:

-bash-4.2$ ils -l
/tempZone/home/rods:
  rods              0 replResc;rodsresc2Resource         5606 2017-12-12.16:23 & bash-history-orig.txt
  rods              1 replResc;rodsresc1Resource         5606 2017-12-12.16:23 & bash-history-orig.txt
  rods              0 replResc;rodsresc2Resource         4096 2017-12-12.16:18 & ._SC17 Final Program.pdf
  rods              1 replResc;rodsresc1Resource         4096 2017-12-12.16:18 & ._SC17 Final Program.pdf
  rods              0 replResc;rodsresc2Resource      5500794 2017-12-12.16:18 & SC17 Final Program.pdf
  rods              0 replResc;rodsresc2Resource      2593877 2017-12-12.16:16 & updates.img
-bash-4.2$ 

Here the file updates.img was uploaded via the OS X native client, as was SC17 Final Program.pdf (which had extended attributes). The file bash-history-orig.txt was uploaded from OS X via Cyberduck. To add insult to injury, iRODS replicated the xattr file (._SC17 Final Program.pdf) but not the actual file, as you can see.

My personal guess is that the OS X WebDAV client does something stupid that Davrods doesn't expect, causing misbehavior in the iRODS protocol, which in turn breaks replication.

The Apache error log showed something possibly related:

[Tue Dec 12 16:18:41.282197 2017] [dav:error] [pid 12128] [client 192.168.56.1:61710] Unable to deliver content.  [500, #0]
[Tue Dec 12 16:18:41.282231 2017] [dav:error] [pid 12128] (32)Broken pipe: [client 192.168.56.1:61710] Could not write contents to filter.  [500, #0]

The logs on the iRODS endpoint server (in this case the iCAT) were clean, as were the rodsLog.* files on both resource servers.

The server environment was a cleanly built virtual cluster created with irods-provisioner on CentOS 7 hosts, running davrods-1.3.0 and iRODS 4.1.11.

@cjsmeele
Contributor

Hi, thanks for the report!
That's a very odd issue. What do you use to trigger replication?
Does replication succeed reliably via davfs and Cyberduck clients when uploading the exact same files?
Also, just to make sure, is rodsresc2Resource your replicated (source) resource or your replica resource?

The httpd error shown seems to be a failed DAV download (perhaps an aborted download?), which should not be related to replication.

@ilarik
Author

ilarik commented Dec 23, 2017

Hi,

We currently use the iRODS built-in (synchronous) replication triggered by the replication resource plugin. In the previous test I believe the rodsresc2Resource was the destination resource to which the first child resource in the replication tree (rodsresc1Resource) would replicate the file.

I'll get back to you later with a more rigorous test.

@ilarik
Author

ilarik commented Dec 23, 2017

I have now tried to make a more rigorous test which should be fully reproducible from the steps below.

  1. Create test cluster
$ git clone https://github.com/KTH-PDC/irods-provisioner.git -b 4-1-stable
$ cd irods-provisioner
$ vagrant up
$ ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -b -i hosts-test irods-cluster.yml
  2. Create test resource hierarchy and point Davrods at it
$ iadmin mkresc replResc replication
$ iadmin addchildtoresc replResc rodsresc1Resource
$ iadmin addchildtoresc replResc rodsresc2Resource
$ ilsresc replResc
replResc:replication
├── rodsresc1Resource
└── rodsresc2Resource
$ vagrant ssh rodsfront1
$ sudo su -
# sed -i -e "s/demoResc/replResc/" /etc/httpd/conf.d/davrods-vhost.conf
# systemctl restart httpd.service
  3. Generate test file
$ dd if=/dev/urandom of=testfile.dat bs=1048576 count=1024
  4. Generate test hierarchy
$ imkdir repltest
$ icd repltest
$ imkdir osx-webdav
$ imkdir cyberduck
$ imkdir centos6-gnome
$ imkdir centos7-gnome
  5. Upload via OS X (10.11.6) WebDAV into /home/rods/repltest/osx-webdav

  6. Upload via Cyberduck into /home/rods/repltest/cyberduck

  7. Upload via CentOS 6 (latest) GNOME WebDAV into /home/rods/repltest/centos6-gnome

  8. Upload via CentOS 7 (latest) GNOME WebDAV into /home/rods/repltest/centos7-gnome

  9. Check results

$ ils -lr
/tempZone/home/rods:
  C- /tempZone/home/rods/repltest  
/tempZone/home/rods/repltest:
  C- /tempZone/home/rods/repltest/centos6-gnome  
/tempZone/home/rods/repltest/centos6-gnome:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:04 & testfile.dat
  rods              1 replResc;rodsresc1Resource   1073741824 2017-12-23.22:04 & testfile.dat
  C- /tempZone/home/rods/repltest/centos7-gnome  
/tempZone/home/rods/repltest/centos7-gnome:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:10 & testfile.dat
  rods              1 replResc;rodsresc1Resource   1073741824 2017-12-23.22:10 & testfile.dat
  C- /tempZone/home/rods/repltest/cyberduck  
/tempZone/home/rods/repltest/cyberduck:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:02 & testfile.dat
  rods              1 replResc;rodsresc1Resource   1073741824 2017-12-23.22:02 & testfile.dat
  C- /tempZone/home/rods/repltest/osx-webdav  
/tempZone/home/rods/repltest/osx-webdav:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:01 & testfile.dat

It seems that here rodsresc2Resource was the first server to receive the file, since its replica number is 0. That shouldn't have any significance in this case though, since the VMs are identical.
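
For anyone repeating the test, the final check can be automated rather than read by eye. The following is only a minimal sketch in Python: it assumes the `ils -l` / `ils -lr` line layout seen in the listings in this thread (owner, replica number, resource hierarchy, size, timestamp, optional `&` flag, name), and the function name `find_under_replicated` and its `expected_replicas` parameter are made up for illustration. Other iRODS versions may print a different layout.

```python
import re
from collections import defaultdict

# Assumed layout of a data-object line, inferred from the listings in this
# thread: owner, replica number, resource hierarchy, size, timestamp,
# optional "&" status flag, then the name (which may contain spaces).
OBJECT_LINE = re.compile(
    r"^\s+(?P<owner>\S+)\s+(?P<repl>\d+)\s+(?P<resc>\S+)\s+(?P<size>\d+)"
    r"\s+(?P<time>\d{4}-\d{2}-\d{2}\.\d{2}:\d{2})\s+(?:&\s)?(?P<name>.+?)\s*$"
)


def find_under_replicated(listing, expected_replicas=2):
    """Map logical path -> resource hierarchies for objects with too few replicas."""
    replicas = defaultdict(list)
    collection = ""
    for line in listing.splitlines():
        stripped = line.strip()
        if stripped.startswith("C- "):
            continue  # sub-collection entry, not a data object
        if stripped.startswith("/") and stripped.endswith(":"):
            collection = stripped[:-1]  # header such as "/tempZone/home/rods:"
            continue
        match = OBJECT_LINE.match(line)
        if match:
            replicas[f"{collection}/{match['name']}"].append(match["resc"])
    return {p: r for p, r in replicas.items() if len(r) < expected_replicas}


# Excerpt of the `ils -lr` output from the test above.
sample = """\
/tempZone/home/rods/repltest/cyberduck:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:02 & testfile.dat
  rods              1 replResc;rodsresc1Resource   1073741824 2017-12-23.22:02 & testfile.dat
/tempZone/home/rods/repltest/osx-webdav:
  rods              0 replResc;rodsresc2Resource   1073741824 2017-12-23.22:01 & testfile.dat
"""

print(find_under_replicated(sample))
# {'/tempZone/home/rods/repltest/osx-webdav/testfile.dat': ['replResc;rodsresc2Resource']}
```

In practice the listing text could come from the output of `ils -lr` captured to a file. Only the object uploaded via the OS X client is reported, which matches what the listing shows.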

@trel
trel commented Feb 10, 2021

We have made progress in the duplicate issue irods/irods#4779.

When we close that issue, we can close this issue.
