Add cost func #3

Gobd · 2024-04-19T18:06:18Z

Sorry about all the formatting changes :( Maybe this would make the benchmarks fairer: it allows a user-supplied cost func, so this cache can be treated more like the other caches in the benchmark.

Don't merge this since I changed the go.mod, just an example.

kolinfluence · 2024-04-19T20:34:10Z

@Gobd I've explained the "cost" factor on reddit; I think it is unnecessary added overhead for this LRU. If you can do the TTL version as "cost", that would be great, and I will accept that as a pull.
https://www.reddit.com/r/golang/comments/1c6rv7y/fastest_zero_allocation_golang_lru_package/

Can you check the reddit thread and maybe explain why this cost factor is necessary?

Having said that, I can do a branch that supports a cost factor, or separate it out for special use cases.

Gobd · 2024-04-19T20:44:58Z

I think your reddit account and all the posts you made were removed :( I don't think this adds any overhead, and it starts an options pattern that could be used for other things too, like the number of shards or whatever else.
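For readers unfamiliar with the options pattern mentioned here, a minimal sketch follows. It is not the code from this PR: `config`, `newConfig`, `WithShards`, and `defaultCost` are hypothetical names, and the assumption that the default cost is the key length plus the value length is only inferred from the memory-based capacity discussed in this thread. Only the `WithCostFunc` name and the `func(key, value []byte) int64` signature appear in the thread itself.

```go
package main

import "fmt"

// CostFunc reports how much of the cache budget one entry consumes.
type CostFunc func(key, value []byte) int64

// config holds hypothetical cache settings; the field names are illustrative.
type config struct {
	shards int
	cost   CostFunc
}

// Option mutates a config during construction.
type Option func(*config)

// WithCostFunc overrides the default cost function.
func WithCostFunc(f CostFunc) Option {
	return func(c *config) { c.cost = f }
}

// WithShards is a hypothetical second option, showing how the pattern extends.
func WithShards(n int) Option {
	return func(c *config) { c.shards = n }
}

// defaultCost charges an entry by its key and value byte lengths (an assumption).
func defaultCost(key, value []byte) int64 {
	return int64(len(key) + len(value))
}

// newConfig applies options over defaults; callers that pass none see no change.
func newConfig(opts ...Option) config {
	c := config{shards: 1, cost: defaultCost}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	byBytes := newConfig()
	byItems := newConfig(
		WithShards(4),
		WithCostFunc(func(key, value []byte) int64 { return 1 }),
	)

	k, v := []byte("user:1"), []byte("alice")
	fmt.Println(byBytes.shards, byBytes.cost(k, v)) // prints "1 11"
	fmt.Println(byItems.shards, byItems.cost(k, v)) // prints "4 1"
}
```

Because the options are variadic, existing calls that pass no options keep compiling and keep the default behavior, which is the sense in which the pattern adds no burden for current users.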

kolinfluence · 2024-04-19T21:07:39Z

@Gobd
here's the reddit post filtered.

I have no idea why they deleted the post; it was my first time posting on reddit for golang stuff.

I'll need to check out what this cost factor is about.
To me, an LRU cache means all items are equally important. You can't know in advance which item should have a higher cost than the rest; the "cost/weight" depends on how frequently an item is used, so the more frequently it is used, the higher it should rank.

The "cost" factor that I will implement in the future will be based on TTL. So the greater the TTL, the more important the item is? Anyway, TTL will be added in the future.

I still can't wrap my head around a "cost" factor on a TTL LRU for now. I think the additional processing for a "cost" adds extra overhead to the LRU.

This is what may make Accelru stand out while keeping the LRU comparison fair:
compare CPU use to CPU use and memory use to memory use. That would be a fairer comparison.

When you limit by item count, you are not constraining the cache to the device's available resources. If you constrain by memory instead, that's where this LRU will seem "better"; after all, what matters is the memory limit and how that memory is used as cache.

P.S.: for each category, let's take the lowest memory allocation used by the other caches (other than phuslu/lru, which doesn't have memory allocations) as the total cache size when testing against this LRU. How about this for comparison? Let's see what happens.

One more thing: the lower bound of memory allocation used by otter with 10000 items is about 1.3MB, and accelru's keys and values should fit within that. So I can take this as the memory used: just put NewLRUCache(1300000,1). I'm only curious about the hit ratio then. The total memory used by accelru will then be confined to 1.3MB for the 10k items. (Honestly, I think that's a bit too much; I would use just 150k of memory for a 10k-item LRU cache of integers, but that's just me.) I can even estimate the hit ratio without actual runs; guess I'm quite a Pareto principle believer.

I'll take the lowest "memory allocation used" among the caches compared (only the ones you are comparing today) as the cache capacity size for each of the sections you have done.

I can live with this. (Just the lowest memory allocation used in each category will do.)

I won't speculate on why the post was deleted, but if you can, do help post your findings on reddit etc.
You seem very into LRU. When you do have the findings, do mention them; your benchmark work is great.

Instead of deleting the thread, they deleted the whole post and filtered the content, preventing any mention of the repo.
I thought they would at least delete only the thread.

Gobd · 2024-04-19T21:40:43Z

Here's what 2 of the hit ratio benchmarks from there look like with the cost of an item fixed at 1, using my branch. You can compare them to the results at https://github.com/Gobd/benchmarks for these 2 traces to see the difference when this cache is treated more fairly relative to the other caches.

kolinfluence · 2024-04-19T21:54:49Z

@Gobd what were the NewLRUCache settings used?

Can you show the chart for this? I only modified this: capacity * 40 bytes.

func (c *Cloudxaas) Init(capacity int) {
        client := lru.NewLRUCache(int64(capacity*40), 1)
        c.client = client
}

func (c *CloudxaasString) Init(capacity int) {
        client := lru.NewLRUCache(int64(capacity*40), 1)
        c.client = client
}

I have an issue with the simulator's chart generation: what software do I need to install to see the chart for this? See the error message at the bottom.

./bench.sh
2024/04/19 18:44:35 Simulation for cache phuslu at capacity 1000000 completed with hit ratio 72.92%
2024/04/19 18:44:35 Simulation for cache otter at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:35 Simulation for cache lru at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:36 Simulation for cache cloudxaas at capacity 1000000 completed with hit ratio 89.70%
2024/04/19 18:44:36 Simulation for cache lru at capacity 1000000 completed with hit ratio 72.95%
2024/04/19 18:44:37 Simulation for cache otter at capacity 1000000 completed with hit ratio 74.79%
2024/04/19 18:44:38 Simulation for cache arc at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:41 Simulation for cache arc at capacity 1000000 completed with hit ratio 76.49%
2024/04/19 18:44:42 Simulation for cache theine at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:44 Simulation for cache theine at capacity 1000000 completed with hit ratio 77.31%
2024/04/19 18:44:50 Simulation for cache phuslu at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:50 Simulation for cache otter at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:51 Simulation for cache cloudxaas at capacity 2000000 completed with hit ratio 89.70%
2024/04/19 18:44:52 Simulation for cache lru at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:55 Simulation for cache arc at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:55 Simulation for cache phuslu at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:57 Simulation for cache theine at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:57 Simulation for cache cloudxaas at capacity 3000000 completed with hit ratio 89.70%
2024/04/19 18:44:57 Simulation for cache ristretto at capacity 1000000 completed with hit ratio 2.15%
2024/04/19 18:44:58 Simulation for cache otter at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:02 Simulation for cache ristretto at capacity 2000000 completed with hit ratio 4.13%
2024/04/19 18:45:05 Simulation for cache lru at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:08 Simulation for cache arc at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:08 Simulation for cache phuslu at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:09 Simulation for cache cloudxaas at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:10 Simulation for cache theine at capacity 4000000 completed with hit ratio 89.70%
2024/04/19 18:45:11 Simulation for cache otter at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:12 Simulation for cache lru at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:17 Simulation for cache theine at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:18 Simulation for cache arc at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:19 Simulation for cache phuslu at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:20 Simulation for cache ristretto at capacity 3000000 completed with hit ratio 5.66%
2024/04/19 18:45:22 Simulation for cache otter at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:23 Simulation for cache cloudxaas at capacity 5000000 completed with hit ratio 89.70%
2024/04/19 18:45:24 Simulation for cache lru at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:28 Simulation for cache arc at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:29 Simulation for cache theine at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:31 Simulation for cache phuslu at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:32 Simulation for cache otter at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:32 Simulation for cache cloudxaas at capacity 6000000 completed with hit ratio 89.70%
2024/04/19 18:45:36 Simulation for cache lru at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:36 Simulation for cache ristretto at capacity 4000000 completed with hit ratio 7.63%
2024/04/19 18:45:38 Simulation for cache theine at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:40 Simulation for cache arc at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:42 Simulation for cache phuslu at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:43 Simulation for cache cloudxaas at capacity 7000000 completed with hit ratio 89.70%
2024/04/19 18:45:43 Simulation for cache otter at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:43 Simulation for cache ristretto at capacity 5000000 completed with hit ratio 9.46%
2024/04/19 18:45:47 Simulation for cache ristretto at capacity 6000000 completed with hit ratio 10.80%
2024/04/19 18:45:48 Simulation for cache lru at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:49 Simulation for cache theine at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:50 Simulation for cache arc at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:50 Simulation for cache ristretto at capacity 7000000 completed with hit ratio 12.84%
2024/04/19 18:45:51 Simulation for cache phuslu at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:51 Simulation for cache cloudxaas at capacity 8000000 completed with hit ratio 89.70%
2024/04/19 18:45:53 Simulation for cache ristretto at capacity 8000000 completed with hit ratio 15.50%
2024/04/19 18:45:53 All simulations are complete
|   CACHE   | 1,000,000 | 2,000,000 | 3,000,000 | 4,000,000 | 5,000,000 | 6,000,000 | 7,000,000 | 8,000,000 |
|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| otter     |     74.79 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
| theine    |     77.31 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
| ristretto |      2.15 |      4.13 |      5.66 |      7.63 |      9.46 |     10.80 |     12.84 |     15.50 |
| lru       |     72.95 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
| arc       |     76.49 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
| phuslu    |     72.92 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
| cloudxaas |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |     89.70 |
2024/04/19 18:45:53 simulate trace: create report: save chart: page load error net::ERR_FILE_NOT_FOUND
exit status 1

Gobd · 2024-04-19T22:57:20Z

lru.NewLRUCache(int64(capacity), 1, lru.WithCostFunc(func(key, value []byte) int64 {
	// Every entry costs 1, so capacity counts items rather than bytes.
	return 1
}))

So it's the same as the others: the limit is based on item count, not memory.

kolinfluence · 2024-04-19T23:35:50Z

@Gobd what about a memory-to-memory comparison?
I'll think about how to optimize it in the way you mentioned.
It's not a current priority, but I will work on it in the future.

Curious: is it possible to see how the hit ratio compares when this cache uses the same amount of memory as the others,
without additional modification to the original code?

kolinfluence · 2024-04-19T23:48:54Z

@Gobd I thought about it, and your initial capacity is still capped by memory used, e.g. 10000 = 10KB of memory, so the lower hit ratio is expected. You have to increase the initial capacity to a memory size too.

kolinfluence · 2024-04-20T00:04:59Z

@Gobd if you really want to test the capacity limit, then the way should be:
(total bytes of each item's key and value) * (item capacity) = capacity as a memory size
This will be a closer approximation of the number of items that will fit into the memory.

If all the keys and values are equal in size, the result will be very accurate.
If the sizes are totally uneven with high deviation, measuring by the total memory used by the program is a much better gauge of hit ratio.

Please check my remarks and publish the findings; total memory used is the best way to justify the effectiveness of a cache for golang use.
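The conversion described above can be sketched as a pair of helpers. The names `byteCapacity` and `itemsThatFit` are hypothetical, and the 8-byte key / 32-byte value split is an assumption; the thread only states 40 bytes per item and the 1.3MB budget.

```go
package main

import "fmt"

// byteCapacity converts an item-count capacity into a byte budget,
// assuming every entry has the same key and value size.
func byteCapacity(items, keyBytes, valueBytes int) int64 {
	return int64(items) * int64(keyBytes+valueBytes)
}

// itemsThatFit is the inverse: how many fixed-size entries fit in a byte budget.
// It ignores per-entry bookkeeping overhead, so real caches will hold fewer.
func itemsThatFit(budget int64, keyBytes, valueBytes int) int64 {
	size := int64(keyBytes + valueBytes)
	if size <= 0 {
		return 0
	}
	return budget / size
}

func main() {
	// 10,000 items at 40 bytes each, matching the capacity*40 setting in the thread.
	fmt.Println(byteCapacity(10000, 8, 32)) // prints "400000"

	// A 1.3MB budget, as proposed for the otter comparison.
	fmt.Println(itemsThatFit(1300000, 8, 32)) // prints "32500"
}
```

The estimate is exact only when entries are uniform in size, which matches the caveat in the comment above.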

kolinfluence · 2024-04-20T19:46:25Z

@Gobd I've published a faster, more memory-efficient cache for your test; you can define your own hashing algorithm in this one.

https://github.com/cloudxaas/gocache/tree/main/lrux/bytes

Gobd added 2 commits April 19, 2024 12:05

Add cost func

db78ac5

Change module for testing

d7c8d86

Gobd marked this pull request as draft April 19, 2024 18:10
