[Sheepdog] support object recovery - too many open files
Piavlo
piavka at cs.bgu.ac.il
Sun Jan 31 07:21:29 CET 2010
Hi,
Any progress with this issue?
Thanks
Alex
Piavlo wrote:
> MORITA Kazutaka wrote:
>
>> On 2010/01/25 18:39, Piavlo wrote:
>>
>>
>>>>> During first image creation with
>>>>> fire-srv3 ~ # qemu-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa
>>>>> sd_claim 1351: zopa
>>>>>
>>>>> I get "Too many open files" for the collie process on just one of the
>>>>> nodes -> fire-srv4:
>>>>>
>>>>>
>>>>>
>>>> There seem to be fd leaks, sorry.
>>>> Probably it is because of the request forwarding code.
>>>> I'll fix it later.
>>>>
>>>>
>>>>
>>> Was it fixed by today's merge of the next and master branches?
>>>
>>>
>> Probably, it is fixed.
>> Could you try the current master branch?
>>
>>
>
> Now it gets stuck on an error:
>
> fire-srv3 ~ # qemu-img convert -f raw -O sheepdog /dev/sys/kvm-img zopa
> sd_claim 1351: zopa
> aio_read_response 855: No object found
> ....
>
> There is an error in the fire-srv4 logs only:
> Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
> Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
> Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
> Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
> Jan 25 14:26:16 localhost collie: store_queue_request(540) 0, 4, 40144 , 3, 3
> Jan 25 14:26:16 localhost collie: store_queue_request(573) failed, 0, 4, 40144 , 3, 3
>
> and the collie process has 244 sockets open (though not 1024)
> lsof -p 7159 | awk '$5 ~ /^sock$/' | wc
> 214 2354 19688
> fire-srv4 ~ #
>
> On the other nodes, the collie processes do not have any such sockets open.
>
>
>> Thanks,
>>
>> Kazutaka Morita
>>
>>
>
>
>
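The socket count quoted above (lsof piped through awk and wc) can also be taken straight from /proc, which makes it easy to poll while the conversion runs. The following is only a rough sketch, assuming a Linux host with /proc mounted; the function name count_socket_fds is made up for illustration and is not part of sheepdog or collie.

```python
import os
import sys


def count_socket_fds(pid):
    """Count the open descriptors of process `pid` that are sockets.

    Assumes Linux: each entry in /proc/<pid>/fd is a symlink, and socket
    descriptors point at a target of the form "socket:[inode]".
    """
    fd_dir = f"/proc/{pid}/fd"
    count = 0
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            count += 1
    return count


if __name__ == "__main__":
    # Default to the current process when no pid is given on the command line.
    pid = int(sys.argv[1]) if len(sys.argv) > 1 else os.getpid()
    print(pid, count_socket_fds(pid))
```

Run as root (or as the owner of the process) with the collie pid as the only argument. A count that keeps growing between runs would be consistent with the fd leak discussed in this thread.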