When I try to read a corrupted .xls file (one where part of the file was accidentally lost), my application hangs in an infinite loop, whereas I expect it to fail with an exception. Since I use the library in a web application, this behavior can slow the server down considerably. It also opens the door to denial-of-service attacks. An example corrupted file is attached.
This issue can only be reproduced in "Any CPU" or "x64" build mode; in "x86" mode an OutOfMemory exception is thrown instead.
__Issue reason:__
I have discovered that the application gets stuck in the "do while" loop in XlsStream.ReadStream. For the corrupted file, fat.GetNextSector never returns FATMARKERS.FAT_EndOfChain, so the loop never stops.
__Fix proposal:__
The infinite loop can easily be avoided by checking the return value of the m_fileStream.Read call (the number of bytes read from the file). Once the end of the file is reached, m_fileStream.Read returns zero, which means we can exit the loop. So I added an additional check to the "while" condition.
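To illustrate the idea, here is a minimal, language-neutral sketch in Python. It is not the ExcelDataReader code: `read_chain`, `next_sector`, `SECTOR_SIZE`, and `END_OF_CHAIN` are hypothetical stand-ins for XlsStream.ReadStream, fat.GetNextSector, the sector size, and FATMARKERS.FAT_EndOfChain, and the seek arithmetic is simplified.

```python
import io

SECTOR_SIZE = 512          # assumed sector size for this sketch
END_OF_CHAIN = 0xFFFFFFFE  # hypothetical stand-in for FATMARKERS.FAT_EndOfChain


def read_chain(stream, start_sector, next_sector):
    """Collect the bytes of a sector chain, stopping at end of chain or end of file."""
    chunks = []
    sector = start_sector
    while True:
        stream.seek(sector * SECTOR_SIZE)
        data = stream.read(SECTOR_SIZE)
        chunks.append(data)
        sector = next_sector(sector)
        # The extra condition: a zero-byte read means the file has ended,
        # so stop even if the end-of-chain marker never shows up.
        if sector == END_OF_CHAIN or len(data) == 0:
            break
    return b"".join(chunks)


# A truncated file: the chain claims sectors 0, 1, 2, 3, ... but only two exist.
truncated = io.BytesIO(b"A" * SECTOR_SIZE + b"B" * SECTOR_SIZE)
result = read_chain(truncated, 0, lambda s: s + 1)
```

Without the `len(data) == 0` condition, the loop in this example would never terminate, because the simulated chain never yields the end marker.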
With this fix, an attempt to read a corrupted file results in an "ArgumentOutOfRange" exception thrown somewhere further in the code. This is much better than a hang; however, the exception is not informative. I guess we could explicitly throw an exception about file corruption when m_fileStream.Read returns zero, but I'm not sure that this situation can never happen in normal operation (I just don't know the inner structure of ExcelDataReader well enough). I have attached both variants of the fix.
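The explicit-exception variant could look roughly like the following Python sketch. Again, this is an illustration under assumptions, not the attached patch: `CorruptedFileError`, `read_chain_strict`, and `try_read` are hypothetical names, and the sketch assumes that a zero-byte read never occurs for a valid file, which is exactly the point I am unsure about.

```python
import io

SECTOR_SIZE = 512          # assumed sector size for this sketch
END_OF_CHAIN = 0xFFFFFFFE  # hypothetical stand-in for FATMARKERS.FAT_EndOfChain


class CorruptedFileError(Exception):
    """Hypothetical exception type signalling a truncated or corrupted file."""


def read_chain_strict(stream, start_sector, next_sector):
    """Read a sector chain; raise if the chain points past the end of the file."""
    chunks = []
    sector = start_sector
    while sector != END_OF_CHAIN:
        stream.seek(sector * SECTOR_SIZE)
        data = stream.read(SECTOR_SIZE)
        if not data:
            # Assumption: a zero-byte read cannot happen for a valid file.
            raise CorruptedFileError(
                f"sector {sector} lies beyond the end of the file")
        chunks.append(data)
        sector = next_sector(sector)
    return b"".join(chunks)


def try_read(raw, chain):
    """Return the chain contents, or a readable message if the file is corrupted."""
    try:
        return read_chain_strict(io.BytesIO(raw), 0, lambda s: chain[s])
    except CorruptedFileError as exc:
        return f"corrupted: {exc}"
```

Here `chain` is a plain dictionary mapping each sector to the next one, so an intact file would be `{0: 1, 1: END_OF_CHAIN}`, while a chain such as `{0: 1, 1: 2, 2: 3}` over a two-sector file triggers the error.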
Can the fix be added to the trunk version? Should I upload the patch in the "Source Code" section?
Thanks!
Comments: migrated to github