This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

Failure creating flashcache with 1 sector blocksize #198

Open
FlorianHeigl opened this issue Dec 20, 2014 · 11 comments

Comments

@FlorianHeigl

Hi,

I want to optimize flashcache for use in a Xen host. The hit rates running mixed-size IO were, well, shitty.

dmsetup table with the iosize stats from flashcache shows that the IOs I'm getting from the VMs are not all 4KB-aligned. At least (phew) they come in 512-byte increments.
Because of that I tried the following:

# flashcache_create -p back -s 32G -b 1 flashcache_slc /dev/sdc /dev/sdb
cachedev flashcache_slc, ssd_devname /dev/sdc, disk_devname /dev/sdb cache mode WRITE_BACK
block_size 1, md_block_size 8, cache_size 67108864
Flashcache metadata will use 1408MB of your 6852MB main memory
device-mapper: reload ioctl on flashcache_slc failed: Invalid argument
Command failed
echo 0 7808589824 flashcache /dev/sdb /dev/sdc flashcache_slc 1 2 1 67108864 512 512 8 | dmsetup create flashcache_slc failed

This seems to correspond with the following message in dmesg:

[ 1482.536577] Invalid Disk Assoc assoc 512 disk_assoc 512 size 67108864
[ 1482.536598] device-mapper: table: 253:5: flashcache: flashcache: Invalid disk associativity
[ 1482.536601] device-mapper: ioctl: error adding target to table

I've raised the dom0 memory to >6GB and flashcache's calculation says it'll need about 1.4GB:

Flashcache metadata will use 1408MB of your 6852MB main memory

Is this a bug, or do I need to adjust some other value to make it work, formula-wise?
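
The sizes printed above can be checked by hand. The sketch below is only a sanity check and not flashcache's own code: it converts the -s cache size into blocks of -b sectors and estimates the metadata footprint. The per-block cost of 22 bytes is inferred from the two printed values (1408MB for 67108864 blocks), not taken from the flashcache source, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BYTES 512ULL

/* Number of cache blocks for a cache of cache_bytes bytes when the
 * block size is given in 512-byte sectors (the -b argument). */
uint64_t cache_blocks(uint64_t cache_bytes, uint64_t block_sectors)
{
    assert(block_sectors > 0);
    return cache_bytes / (block_sectors * SECTOR_BYTES);
}

/* Metadata footprint in MB for a fixed per-block cost.
 * bytes_per_block is an assumption inferred from the tool's output. */
uint64_t metadata_mb(uint64_t blocks, uint64_t bytes_per_block)
{
    return blocks * bytes_per_block / (1024ULL * 1024ULL);
}
```

Under the same assumption, a 32G cache with -b 8 (4KB blocks) would have 8388608 blocks and need roughly 176MB of metadata, which is why the tool suggests a larger block size when memory is tight.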

@mohans
Contributor

mohans commented Dec 21, 2014

This looks like a bug in the disk assoc parameter checking. Unfortunately I am without access to a computer until Dec 31 and can't look at it until then.

You could try playing with the disk assoc arg as a workaround (try disabling disk assoc completely; does 0 do it? I don't recall and would have to look at the code).


@FlorianHeigl
Author

Thanks a lot Mohan,

I had looked at the assoc parameter check but didn't fully understand what it's doing.
I'll poke around a little & update if I find something.

It seems to also occur if I use -m (md block size); I had been trying to align on the 128KB-striped hw raid there.

@FlorianHeigl
Author

Tested these two, no luck so far.

daveh0003:~# flashcache_create -p back -b 1 -a 0 -m 128 -s 64G flashcache_slc /dev/sdc /dev/sdb
cachedev flashcache_slc, ssd_devname /dev/sdc, disk_devname /dev/sdb cache mode WRITE_BACK
block_size 1, md_block_size 128, cache_size 134217728
Flashcache metadata will use 2816MB of your 6852MB main memory
Proportion of main memory needed for flashcache metadata is high.
You can reduce this with a smaller cache or a larger blocksize.
Are you sure you want to proceed ? (y/n): y

flashcache_create: Invalid Disk Associativity 512

setting to 1

daveh0003:~# flashcache_create -p back -b 1 -a 1 -m 128 -s 64G flashcache_slc /dev/sdc /dev/sdb
cachedev flashcache_slc, ssd_devname /dev/sdc, disk_devname /dev/sdb cache mode WRITE_BACK
block_size 1, md_block_size 128, cache_size 134217728
Flashcache metadata will use 2816MB of your 6852MB main memory
Proportion of main memory needed for flashcache metadata is high.
You can reduce this with a smaller cache or a larger blocksize.
Are you sure you want to proceed ? (y/n): y

flashcache_create: Invalid Disk Associativity 512

@FlorianHeigl
Author

Gonna find someone who safely reads C and then poke at it more. Your comment in the code clearly says that disk assoc "0" should work.
I'm at 31c3 at the moment; there are probably 1000 people around here who can help, and I just need to find the 1 or 2 I can also trust ;)

@mohans
Contributor

mohans commented Dec 27, 2014

I am still without Internet/computer access save for my phone. I will take a look next Wednesday - unless you fix it by then...


@FlorianHeigl
Author

@mohans enjoy your time offline. I'll post if I find anything relevant. For now I changed the block size to 2KB, which should already improve hit rates by some margin.

@FlorianHeigl
Author

Poked around with it a little today, no luck yet.
The -d option seems to be half a leftover from v2: it doesn't show up in the online help but is still accepted. Setting it has no positive effect :)

Let's chat next year. Happy new year.

@cofol1986

Hi @FlorianHeigl @mohans ,
The problem is in the disk_assoc check in the function flashcache_ctr in flashcache_conf.c. I have fixed this bug and opened a pull request.
If we change the cache block size to 1 sector, it seems that both the SSD's lifetime and flashcache's performance will drop dramatically: the SSD's erase unit is usually 2k or 4k, and device mapper will break each request into 512-byte pieces. This causes a large number of requests and eventually leads to uncached IO (flashcache_do_pending_noerror).
Don't forget to set the associativity to a larger number (greater than 512).
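
To put a rough number on the request-count concern: if every IO is handled in cache-block-sized pieces, the number of pieces per IO grows as the block size shrinks. The helper below is a back-of-the-envelope illustration only, not flashcache or device-mapper code, and it assumes each IO starts on a block boundary. The name split_count is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of block-sized requests needed to cover an IO of io_sectors
 * sectors when the cache block is block_sectors sectors long.
 * Assumes the IO starts on a block boundary (illustrative only). */
uint64_t split_count(uint64_t io_sectors, uint64_t block_sectors)
{
    assert(block_sectors > 0);
    /* Round up so a partial trailing block still counts as a request. */
    return (io_sectors + block_sectors - 1) / block_sectors;
}
```

A 4KB write (8 sectors) is a single request with -b 8 but eight requests with -b 1, and a 128KB IO (256 sectors) becomes 256 requests with -b 1.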

@mohans
Contributor

mohans commented Jan 5, 2015

Hi

Florian - Sorry for the late response. Are you running flashcache trunk? I tried to reproduce your issue on my end with a block size of 1 sector, but have been unable to (with trunk). The puzzling thing is that the default disk_assoc passed by the flashcache_create wrapper into the module is 0, so the code in the kernel module that causes the create to fail should never have executed at all. Tracking this down, I found one issue in flashcache_create.c that could cause garbage values of disk_assoc to be passed into the module. It is a sprintf formatting issue, fixed by:

diff --git a/src/utils/flashcache_create.c b/src/utils/flashcache_create.c
index 380b5ca..f935ea9 100644
--- a/src/utils/flashcache_create.c
+++ b/src/utils/flashcache_create.c
@@ -358,7 +358,7 @@ main(int argc, char **argv)
                        ssd_devname, disk_devname);
                check_sure();
        }
-       sprintf(dmsetup_cmd, "echo 0 %lu flashcache %s %s %s %d 2 %lu %lu %d %lu %d %lu"
+       sprintf(dmsetup_cmd, "echo 0 %lu flashcache %s %s %s %d 2 %lu %lu %d %d %d %lu"
                " | dmsetup create %s",
                disk_devsize, disk_devname, ssd_devname, cachedev, cache_mode, block_size,
                cache_size, associativity, disk_associativity, write_cache_only, md_block_size,

With this change, a disk_assoc value of 0 should be passed into the module (disk assoc disabled), which should not trigger the failure. Would you be able to quickly try this and tell me if it works for you?
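
For readers unfamiliar with this class of bug: printf-style functions trust the format string, so pairing a %lu conversion with an int argument is undefined behaviour and can print garbage. The sketch below mirrors the corrected format string with explicitly typed parameters. The parameter types are assumptions chosen to match the conversions in the patch, and build_dmsetup_cmd is a hypothetical helper, not a function in flashcache_create.c.

```c
#include <stdio.h>

/* Build the dmsetup command line into buf.
 * Each argument's type matches its conversion:
 *   %lu <-> unsigned long, %d <-> int, %s <-> const char *.
 * The final %s is the cache device name, as in the failing command
 * quoted in the report. Returns snprintf's result (the length needed,
 * excluding the terminating NUL). */
int build_dmsetup_cmd(char *buf, size_t len,
                      unsigned long disk_devsize,
                      const char *disk_devname, const char *ssd_devname,
                      const char *cachedev, int cache_mode,
                      unsigned long block_size, unsigned long cache_size,
                      int associativity, int disk_associativity,
                      int write_cache_only, unsigned long md_block_size)
{
    return snprintf(buf, len,
                    "echo 0 %lu flashcache %s %s %s %d 2 %lu %lu %d %d %d %lu"
                    " | dmsetup create %s",
                    disk_devsize, disk_devname, ssd_devname, cachedev,
                    cache_mode, block_size, cache_size, associativity,
                    disk_associativity, write_cache_only, md_block_size,
                    cachedev);
}
```

Using snprintf with an explicit buffer length, rather than sprintf, also lets the caller detect truncation by comparing the return value with the buffer size. Compiling with format warnings enabled (for example -Wformat in GCC or Clang) would normally flag a mismatch like the one in the patch.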

cofol1986 - The check for disk_assoc is indeed broken. The intent there was to make sure that the configured disk_assoc is not larger than the cache set assoc. I am not sure that '(dmc->assoc * dmc->block_size) < dmc->disk_assoc' fixes that, though (we can compare dmc->assoc * dmc->block_size with dmc->disk_assoc * dmc->block_size). I think we can also just check for 'dmc->assoc < dmc->disk_assoc', since both of those quantities are in terms of flashcache blocks.

There are a few other issues: with a block size < 1KB, a few items in dmsetup table will wrongly show up as 0 (because they are expressed in KB). Those would need to be fixed as well.
mohan
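
The simpler check suggested above could look roughly like the following. This is a sketch under stated assumptions, not the code in flashcache_conf.c: struct cache_params is a stand-in for the real cache context, only the two fields named in the discussion are modelled, and treating disk_assoc == 0 as "disabled" follows the comments in this thread.

```c
#include <stdbool.h>

/* Stand-in for the fields of the cache context discussed here.
 * Both values are counted in flashcache blocks. */
struct cache_params {
    unsigned int assoc;       /* cache set associativity */
    unsigned int disk_assoc;  /* disk associativity, 0 = disabled */
};

/* Returns true when disk_assoc is acceptable: either disabled (0)
 * or no larger than the cache set associativity. */
bool disk_assoc_ok(const struct cache_params *p)
{
    if (p->disk_assoc == 0)
        return true;
    return p->disk_assoc <= p->assoc;
}
```

With assoc 512 and disk_assoc 512, the values in the dmesg line from the report, this version of the check passes, and the result no longer depends on the block size.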



@FlorianHeigl
Author

Hi,

No, not running trunk, sorry...
I'll need to wait for the next round of OS updates to apply the patch.

Besides, regarding the reduced performance: as it is, it cannot drop any lower. I'm getting about 1/8th of the raw disk performance, even though the disk is fronted by an enterprise SLC SSD.
My hope was to hit the lowest common denominator, since only some of the incoming IOs were 4K-sized, but all were multiples of 512 bytes.
I'm aware it is just a try ;)

@ghost

ghost commented Aug 4, 2015

Thank you for reporting this issue and appreciate your patience. We've notified the core team for an update on this issue. We're looking for a response within the next 30 days or the issue may be closed.
