[Sheepdog] support object recovery - too many open files
Piavlo
piavka at cs.bgu.ac.il
Mon Jan 25 13:33:47 CET 2010
MORITA Kazutaka wrote:
> On 2010/01/25 18:39, Piavlo wrote:
>
>>>> During first image creation with
>>>> fire-srv3 ~ # qemu-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa
>>>> sd_claim 1351: zopa
>>>>
>>>> I get "Too many open files" for the collie process on just one of the
>>>> nodes -> fire-srv4:
>>>>
>>>>
>>> There seem to be fd leaks, sorry.
>>> Probably it is because of the request forwarding code.
>>> I'll fix it later.
>>>
>>>
>> Was it fixed with today's merge of the next and master branches?
>>
>
> Probably, it is fixed.
> Could you try the current master branch?
>
Now it gets stuck on an error:
fire-srv3 ~ # qemu-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa
sd_claim 1351: zopa
aio_read_response 855: No object found
....
There is an error in the fire-srv4 logs only:
Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
Jan 25 14:26:16 localhost collie: store_queue_request(573) failed, 0, 4, 40144 , 3, 3
and the collie process has 244 sockets open (though not 1024):
lsof -p 7159 | awk '$5 ~ /^sock$/' | wc
214 2354 19688
fire-srv4 ~ #
On the other nodes, the collie processes do not have any such sockets open.
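
For anyone who wants to repeat the socket count without lsof, here is a rough sketch that reads /proc directly. It assumes a Linux host with /proc mounted, and the function name count_socket_fds is made up for this example; it is not part of sheepdog or collie.

```shell
#!/bin/sh
# Count the socket file descriptors held by one process (Linux /proc only).
# count_socket_fds is a hypothetical helper name used for illustration.
# Usage: count_socket_fds PID
count_socket_fds() {
    pid="$1"
    count=0
    for fd in /proc/"$pid"/fd/*; do
        # Each fd entry is a symlink; sockets resolve to "socket:[inode]".
        # If the process does not exist or is unreadable, readlink fails
        # and the entry is skipped, so the result is simply 0.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            socket:*) count=$((count + 1)) ;;
        esac
    done
    echo "$count"
}
```

Running `count_socket_fds 7159` as root on fire-srv4 every few seconds during the convert should show whether the number keeps growing, which would point to a leak rather than a steady set of connections.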
> Thanks,
>
> Kazutaka Morita
>