Using mod_cache: unexpected 304 (NOT MODIFIED) on unconditional GET of valid, expired, cached file #22

Open
tedgin opened this issue Aug 25, 2020 · 1 comment

tedgin commented Aug 25, 2020

I think I've found a bug in davrods. I have Apache configured with mod_cache/mod_cache_disk and mod_dav/davrods. The cache holds a copy of an iRODS data object. The copy is valid, meaning the data object still exists in iRODS and hasn't been modified since the copy was cached, but it has expired. According to RFC 2616, when a GET request is made for this data object, Apache should check whether the copy is still valid in iRODS. Since it is, Apache should update the expiration time and return the cached copy as the body of a 200 (OK) response. Instead, when I make the GET request, Apache returns a 304 (NOT MODIFIED) response with an empty body.
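To make the expected behavior concrete, here is a minimal Python sketch of what a cache should do with an unconditional GET when its copy is expired but still valid. It is only an illustration, not mod_cache's actual code: `CachedEntry`, `serve_unconditional_get`, the `revalidate` callback and the `ttl` parameter are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CachedEntry:
    body: bytes
    etag: str
    expires_at: float  # time (epoch seconds) after which the copy is stale


def serve_unconditional_get(entry, now, revalidate, ttl=1.0):
    """Answer a client GET that carries no conditional headers.

    `revalidate` is a hypothetical callback that sends a conditional
    request to the origin with the cached ETag and returns
    (status, body, etag).
    """
    if now < entry.expires_at:
        # Fresh copy: serve it without contacting the origin.
        return 200, entry.body

    status, body, etag = revalidate(entry.etag)
    if status == 304:
        # Origin confirms the copy is still valid: refresh the expiry
        # and answer the client with 200 plus the cached body.
        entry.expires_at = now + ttl
        return 200, entry.body

    if status == 200:
        # Origin sent a new representation: store it.
        entry.body, entry.etag, entry.expires_at = body, etag, now + ttl
    return status, body


# An expired entry whose origin still reports "not modified".
entry = CachedEntry(body=b"cached bytes", etag='"v1"', expires_at=10.0)
status, body = serve_unconditional_get(
    entry, now=20.0, revalidate=lambda etag: (304, b"", etag)
)
```

In this sketch the origin's 304 never reaches the client. It only tells the cache that its stored body may be reused.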

I wasn't certain whether this was an issue with davrods or with something else. To find out, I added a mod_dav_fs-based WebDAV repository on the Apache server's local filesystem and performed the same test. When I made the GET request for the file in the local WebDAV repository, the cached copy was returned as the body of a 200 response. This implies that the bug is likely in davrods.

I'm using iRODS 4.2.8. The WebDAV server runs on CentOS 7 with Apache 2.4.6 and davrods 4.2.8_1.5.0. Here's the virtual host configuration.

<VirtualHost *:80>
  ServerName 128.196.65.41

### MOD_CACHE CONFIGURATION

  CacheDetailHeader  On
  CacheEnable        disk /
  CacheRoot          /var/cache/httpd/proxy
  
  # Have cached files expire quickly
  CacheMaxExpire 1

### 

### MOD_DAV_FS CONFIGURATION

  DavLockDB /var/www/DavLock

  Alias /dav_fs /var/www/webdav

  <Location /dav_fs/>
    AuthType  None
    Require   all granted

    Dav On
  </Location>

###

### DAVRODS CONFIGURATION

  <Location /davrods/>
    AuthType  None
    Require   all granted

    Dav davrods-locallock

    DavRodsEnvFile         /etc/httpd/irods/irods_environment.json
    DavRodsServer          128.196.65.131 1247
    DavRodsZone            cyverse.k8s
    DavRodsAnonymousMode   On
    DavRodsAnonymousLogin  "anonymous" ""
    DavRodsExposedRoot     /cyverse.k8s/home/shared
    DavRodsLockDB          /var/lib/davrods/lockdb_locallock

    DirectoryIndex disabled
  </Location>

###

</VirtualHost>

Here's a curl-based example of how mod_dav_fs responds to caching. Notice that when the cached copy is expired but still valid, Apache refreshes the cached copy and returns it as the body of a 200 response.

prompt> curl -v 128.196.65.41/dav_fs/MOTD
*   Trying 128.196.65.41...
* TCP_NODELAY set
* Connected to 128.196.65.41 (128.196.65.41) port 80 (#0)
> GET /dav_fs/MOTD HTTP/1.1
> Host: 128.196.65.41
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Tue, 25 Aug 2020 18:58:31 GMT
< Server: Apache/2.4.6 (CentOS)
< Last-Modified: Tue, 25 Aug 2020 18:21:52 GMT
< Content-Length: 20
< ETag: "14-5adb7c6fc1c8e"
< Accept-Ranges: bytes
< X-Cache-Detail: "conditional cache hit: entity refreshed" from 128.196.65.41
< 
Hi from mod_dav_fs!
* Connection #0 to host 128.196.65.41 left intact

Here's the same test against davrods. Notice that when the cached copy is expired but still valid, Apache returns a 304 response with an empty body.

prompt> curl -v 128.196.65.41/davrods/MOTD
*   Trying 128.196.65.41...
* TCP_NODELAY set
* Connected to 128.196.65.41 (128.196.65.41) port 80 (#0)
> GET /davrods/MOTD HTTP/1.1
> Host: 128.196.65.41
> User-Agent: curl/7.58.0
> Accept: */*
> 
< HTTP/1.1 304 Not Modified
< Date: Tue, 25 Aug 2020 18:58:46 GMT
< Server: Apache/2.4.6 (CentOS)
< ETag: "11-01598381793"
< 
* Connection #0 to host 128.196.65.41 left intact
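The difference between the two transcripts can be checked mechanically. Below is a small Python sketch, with the hypothetical helper name `is_unexpected_304`, that flags a 304 returned for a request that sent neither `If-None-Match` nor `If-Modified-Since`. The request headers and status codes are copied from the curl output above.

```python
# Request headers that make a GET conditional, so that a 304 is a
# legitimate answer.
CONDITIONAL_HEADERS = {"if-none-match", "if-modified-since"}


def is_unexpected_304(request_headers, status):
    """Return True when a 304 answers a request without validators."""
    sent = {name.lower() for name in request_headers}
    return status == 304 and not (sent & CONDITIONAL_HEADERS)


# Headers curl sent in both transcripts (only the Host value is shown once).
curl_headers = {
    "Host": "128.196.65.41",
    "User-Agent": "curl/7.58.0",
    "Accept": "*/*",
}

dav_fs_result = is_unexpected_304(curl_headers, 200)   # /dav_fs/MOTD
davrods_result = is_unexpected_304(curl_headers, 304)  # /davrods/MOTD
```

Here `dav_fs_result` is `False` and `davrods_result` is `True`: the client never asked a conditional question, so it has no cached body of its own to fall back on when it receives the 304.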

cjsmeele (Contributor) commented

That most certainly is protocol-breaking behavior, yes. Thank you for the detailed report.

The 304 appears to be generated by mod_dav.
How Davrods can influence this behavior requires more investigation.
I'll share what I've found so far below.

As you kindly contributed yourself on the iRODS list, if you are using mod_cache, using a reverse proxy in front of mod_dav/davrods can be a workaround: https://groups.google.com/d/msg/irod-chat/7I9n4ADkkA8/I-c2D4guCAAJ


In my current understanding:

  • Client sends a GET request
  • When revalidation is needed, a mod_cache filter inserts conditional request headers
  • Davrods emits ETag and modify date for the resource (which is normal)
  • mod_dav checks for conditional requests (calls ap_meets_conditions) and returns 304 because e.g. the ETag matches
  • Then somehow the 304 returned by mod_dav is not picked up by mod_cache (as it apparently is in the mod_dav_fs case) but passed directly back to the HTTP client.
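The steps above can be modeled with a short Python sketch. It is a toy simulation under my own assumptions, not Apache code: `origin_handle` stands in for the conditional check that mod_dav performs, `cache_get` stands in for the mod_cache revalidation, and the `absorb_304` flag is a hypothetical switch between the expected behavior and the observed one. The body bytes are a placeholder.

```python
def origin_handle(headers, etag, body):
    """Stand-in for the origin: answer 304 when If-None-Match matches."""
    if headers.get("If-None-Match") == etag:
        return 304, b""
    return 200, body


def cache_get(client_headers, cached, origin, absorb_304):
    """Stand-in for a cache revalidating a stale entry.

    The cache adds the stored ETag as If-None-Match before forwarding.
    With absorb_304=True it turns the origin's 304 into a 200 with the
    cached body; with absorb_304=False the 304 leaks to the client.
    """
    upstream_headers = dict(client_headers)
    upstream_headers["If-None-Match"] = cached["etag"]
    status, body = origin(upstream_headers)
    if status == 304 and absorb_304:
        return 200, cached["body"]
    return status, body


etag = '"11-01598381793"'            # ETag seen in the davrods response
stored = b"placeholder body"         # placeholder, not the real file content
cached = {"etag": etag, "body": stored}


def origin(headers):
    return origin_handle(headers, etag, stored)


client_headers = {"Accept": "*/*"}   # unconditional GET, as sent by curl

expected = cache_get(client_headers, cached, origin, absorb_304=True)
observed = cache_get(client_headers, cached, origin, absorb_304=False)
```

In this model `expected` is a 200 with the cached body, matching the mod_dav_fs case, while `observed` is a bare 304, matching the davrods case. Why the real 304 bypasses mod_cache is still the open question.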

I suspect that the difference from mod_dav_fs lies in the fact that mod_dav_fs delegates GET request handling to Apache core, while Davrods must provide its own handler since it can't use the filesystem.

Furthermore, I can see that the code path in mod_dav that generates these 304 responses, dav_method_get, is never followed for mod_dav_fs requests (because mod_dav does not install itself as a request handler in that case). This would partly explain why local filesystem-based DAV works without issues.

For anyone looking, the relevant bit of code in Davrods is in dav_repo_set_headers, which is a function called by mod_dav directly.

cjsmeele changed the title from "304 (NOT MODIFIED) response for an unconditional GET request of valid, expired, cached file" to "Using mod_cache: unexpected 304 (NOT MODIFIED) on unconditional GET request of valid, expired, cached file" on Aug 25, 2020
cjsmeele changed the title from "Using mod_cache: unexpected 304 (NOT MODIFIED) on unconditional GET request of valid, expired, cached file" to "Using mod_cache: unexpected 304 (NOT MODIFIED) on unconditional GET of valid, expired, cached file" on Aug 25, 2020
cjsmeele added the bug label on Aug 25, 2020