Pointer arithmetic in C#
aggle-rithm
23rd October 2006, 09:46 AM
I am working on a file parsing system in C#, and I want to share an array of bytes between objects, with each object only looking at a subset of the array. There is some overlap, as it breaks the file down into hierarchical structures, some of which contain subsets of the data within "child" objects, as well as the complete set.
In C++, this was fairly simple to do, since all I had to do was add the offset to the original pointer, and there was my new byte array. In C# it's more difficult because the array is an encapsulated object.
I've already started to work with one solution that wraps the array object in a class that allows subsets to be easily referred to, but I'm wondering if I'm missing an easier solution.
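For readers following along, the wrapper idea described above might look something like the sketch below. The class name `ByteSlice` and its members are hypothetical, not the poster's actual code. It assumes each view only needs indexed read/write access and the ability to carve out child views over the same buffer.

```csharp
using System;

// Hypothetical wrapper: a view over part of a shared byte array.
// No bytes are copied; every slice reads and writes the same buffer.
public sealed class ByteSlice
{
    private readonly byte[] buffer;
    private readonly int offset;
    private readonly int length;

    public ByteSlice(byte[] buffer, int offset, int length)
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        if (offset < 0 || length < 0 || offset > buffer.Length - length)
            throw new ArgumentOutOfRangeException("offset");
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    public int Length { get { return length; } }

    // Index is relative to the start of this slice, not the whole buffer.
    public byte this[int index]
    {
        get { return buffer[offset + Check(index)]; }
        set { buffer[offset + Check(index)] = value; }
    }

    // Creates a child view; 'start' is relative to this slice.
    public ByteSlice Slice(int start, int count)
    {
        if (start < 0 || count < 0 || start > length - count)
            throw new ArgumentOutOfRangeException("start");
        return new ByteSlice(buffer, offset + start, count);
    }

    private int Check(int index)
    {
        if (index < 0 || index >= length) throw new IndexOutOfRangeException();
        return index;
    }
}
```

A parent structure and its children can then hold overlapping `ByteSlice` instances, which is roughly what "pointer plus offset" gave you in C++, with bounds checks added.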
Stimpson J. Cat
23rd October 2006, 12:45 PM
I'm not so familiar with C#, but I think you should be able to use iterators to your array class in pretty much the same way you would use pointers in C++.
Dr. Stupid
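A rough sketch of the iterator suggestion above, assuming read-only, forward-only access is enough. The names `ByteRange` and `Enumerate` are made up for illustration. Unlike a C++ pointer, a C# iterator like this cannot write back to the array or jump to an arbitrary position.

```csharp
using System;
using System.Collections.Generic;

public static class ByteRange
{
    // Lazily yields data[start] .. data[start + count - 1] without copying.
    public static IEnumerable<byte> Enumerate(byte[] data, int start, int count)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (start < 0 || count < 0 || start > data.Length - count)
            throw new ArgumentOutOfRangeException("start");
        return Iterate(data, start, count);
    }

    // Kept separate so the argument checks above run at call time
    // rather than on the first MoveNext().
    private static IEnumerable<byte> Iterate(byte[] data, int start, int count)
    {
        for (int i = start; i < start + count; i++)
            yield return data[i];
    }
}
```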
JamesM
23rd October 2006, 01:32 PM
So you want to be able to pass 'slices' of the array around, such that any modification to the slice also affects the whole array?
Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.
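One off-the-shelf way to pass such slices around is the framework's `ArraySegment<T>` struct, which simply bundles an array, an offset, and a count. The sketch below is only an illustration: the helper name `SegmentDemo.Fill` is hypothetical. It indexes through `Array` and `Offset` explicitly, so it does not rely on indexer support on the segment itself.

```csharp
using System;

public static class SegmentDemo
{
    // Overwrites every byte in the segment; the change is visible to
    // anyone else holding the same underlying array.
    public static void Fill(ArraySegment<byte> segment, byte value)
    {
        for (int i = 0; i < segment.Count; i++)
            segment.Array[segment.Offset + i] = value;
    }
}
```

Because the segment only refers to the original array, a modification made through the slice also shows up in the whole array, which is the behaviour asked about above.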
a_unique_person
23rd October 2006, 06:00 PM
In Memory tables are great. You can use SQL queries to look at the subset you want.
aggle-rithm
24th October 2006, 11:37 AM
So you want to be able to pass 'slices' of the array around, such that any modification to the slice also affects the whole array?
Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.
The original purpose of this system, although I might add to it in the future, is to go into an Excel file and grab just the worksheet names. Excel files have a logical format (BIFF) built onto a physical format (compound document file format), so that the logical slices can be subsets of the physical slices. For the sake of data integrity, I really didn't want to split the same data up into multiple buffers. However, memory is cheap, so I may end up taking that approach after all.
Thanks for everyone's input.
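For anyone attempting the same thing, here is a minimal sketch of the logical (BIFF) half of the job. It assumes the Workbook stream has already been pulled out of the compound document into a single byte array, that the file is BIFF8 (Excel 97-2003), and that sheet names live in BOUNDSHEET records (id 0x0085) in the globals substream, with a 1-byte character count and a 1-byte flag whose low bit selects 16-bit characters. Check that layout against the format documentation before relying on it. The class name `BiffSheetNames` is hypothetical, and this is not the original poster's code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class BiffSheetNames
{
    private const int BoundSheet = 0x0085; // BOUNDSHEET record id
    private const int Eof = 0x000A;        // ends the globals substream

    // workbook: bytes of the Workbook stream (assumed already extracted).
    public static List<string> Read(byte[] workbook)
    {
        List<string> names = new List<string>();
        int pos = 0;
        while (pos + 4 <= workbook.Length)
        {
            // Each record starts with a little-endian id and body size.
            int id = workbook[pos] | (workbook[pos + 1] << 8);
            int size = workbook[pos + 2] | (workbook[pos + 3] << 8);
            int body = pos + 4;
            if (body + size > workbook.Length) break; // truncated record
            if (id == Eof) break;
            // Skip 4 bytes of stream position, 1 of visibility, 1 of sheet type.
            if (id == BoundSheet && size >= 8)
                names.Add(ReadName(workbook, body + 6, body + size));
            pos = body + size;
        }
        return names;
    }

    private static string ReadName(byte[] data, int pos, int end)
    {
        int charCount = data[pos];
        bool wide = (data[pos + 1] & 0x01) != 0;
        int start = pos + 2;
        int byteCount = wide ? charCount * 2 : charCount;
        if (start + byteCount > end)
            throw new FormatException("BOUNDSHEET name overruns its record");
        if (wide) return Encoding.Unicode.GetString(data, start, byteCount);
        // Compressed form: each byte is the low byte of a 16-bit character.
        StringBuilder sb = new StringBuilder(charCount);
        for (int i = 0; i < charCount; i++) sb.Append((char)data[start + i]);
        return sb.ToString();
    }
}
```

The sketch never copies the buffer; it only walks offsets within it, which fits the goal of keeping one copy of the data.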
xenxabar
24th October 2006, 08:53 PM
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:
1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.
Wowbagger
24th October 2006, 09:19 PM
Don't fight the object-based approach. That will be the 'easier' solution in C#. It's easier than faking pointer arithmetic, at any rate.
I would agree, here. You don't need to use pointers in C#, except in the most special of cases. (I wish I could remember some good examples of such cases.)
In Memory tables are great. You can use SQL queries to look at the subset you want.
Yeah, but performance is a drag. Cost/Benefit is not worthwhile vs. arrays of files and stuff, for this purpose.
That is an idea worth considering if you have lots of data and are performing relatively complex analysis of it; you can let SQL do most of the hard work.
Grimoire
25th October 2006, 01:19 AM
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:
1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
I agree 100%. I've used this technique before, and it works quite well, even if the documentation can be somewhat hard to find...
xenxabar
25th October 2006, 07:17 AM
I agree 100%. I've used this technique before, and it works quite well, even if the documentation can be somewhat hard to find...
Here's an article for how to open Excel in C#:
http://www.codeproject.com/csharp/csharp_excel.asp
SpeederA
25th October 2006, 09:57 AM
Use interfaces and polymorphism
aggle-rithm
25th October 2006, 12:17 PM
Unless performance is critical, you can easily get names of the Worksheets in a Workbook by doing one of the following:
1. Add a reference to the Excel object library in C# and simply loop through each worksheet in each file (workbook) that you want.
2. Do basically the same as in option 1 but use VBA in Excel instead.
That brings me to my original reason for doing all this: Excel, used as an object, is extremely unstable in multi-threaded applications. Even if it's a multi-threaded application in which the instance of Excel is used ONLY IN THE MAIN THREAD, it will crash consistently.
Unfortunately, we have engineers where I work that insist on using Excel as a database system, even though they have Access and SQL Server. The worksheets are enormously complex and there are THOUSANDS of them that have to be accessed each day. Even in a single-threaded application, there are numerous times when the Excel object locks up and another needs to be started. We end up with fifty instances of Excel running in the background.
Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.
SpeederA
25th October 2006, 12:54 PM
Just create an "ExcelFileInfo" class which stores the byte array and then have nested classes (or structures if the info contains no reference types and will not be boxed) which deal with the breakdown of the array into smaller units.
That, or one of the billion other reasonable ways of compartmentalizing the array. Just do it in a way that completely hides the inner data structure from the rest of the proggy.
Your life will be much easier in the long run.
If you feel that you will be instantiating and destroying too many objects because of the numbers of calls, you can always use resurrection and an object pool to lower the load on the garbage collector.
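As a rough illustration of the pooling idea, here is a minimal sketch. It is a plain explicit-return pool, not the finalizer-based resurrection mentioned above, and it is not thread-safe. `SimplePool` is a hypothetical name, and callers are assumed to reset an object's state before reusing it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal object pool: hands out previously returned
// instances before allocating new ones.
public sealed class SimplePool<T> where T : class, new()
{
    private readonly Stack<T> free = new Stack<T>();

    // Number of idle instances currently held by the pool.
    public int Available { get { return free.Count; } }

    public T Rent()
    {
        return free.Count > 0 ? free.Pop() : new T();
    }

    public void Return(T item)
    {
        if (item == null) throw new ArgumentNullException("item");
        free.Push(item);
    }
}
```

Whether this actually beats letting the garbage collector handle short-lived objects depends on the workload, so it is worth measuring before adopting it.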
Do NOT access Excel worksheets from a multi-threaded application using COM: not only are Excel objects designed to be run in STA mode, they also require specific user rights and the ability to interact with the desktop. Microsoft does not support this in any way, shape, or form in the current versions of Office..... =o/
a_unique_person
25th October 2006, 07:22 PM
That brings me to my original reason for doing all this: Excel, used as an object, is extremely unstable in multi-threaded applications. Even if it's a multi-threaded application in which the instance of Excel is used ONLY IN THE MAIN THREAD, it will crash consistently.
Unfortunately, we have engineers where I work that insist on using Excel as a database system, even though they have Access and SQL Server. The worksheets are enormously complex and there are THOUSANDS of them that have to be accessed each day. Even in a single-threaded application, there are numerous times when the Excel object locks up and another needs to be started. We end up with fifty instances of Excel running in the background.
Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.
Grimoire
25th October 2006, 07:49 PM
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.
I'm totally going to steal that line...
I am going to have to disagree with you though. Of course they can do something stupid and expect someone else to unstupid it. They shouldn't, but they can, and frequently will. As a developer, I feel it is my job to smack them upside the head, and then give them a better solution. People won't respond to arguments that are basically "but you are doing it wrong".
Present to them why something is wrong, the problems found when doing it wrong, what should be done to correct it, how it can be implemented, and what the advantages are to doing it correctly. If they see the light and decide to fix it, perfect. If not, start looking for another job, because a company that willfully continues to do something the wrong way will only end up costing themselves time and money, and stress the hell out of you...
Rob Lister
25th October 2006, 07:57 PM
I'm totally going to steal that line...
I am going to have to disagree with you though. Of course they can do something stupid and expect someone else to unstupid it. They shouldn't, but they can, and frequently will. As a developer, I feel it is my job to smack them upside the head, and then give them a better solution. People won't respond to arguments that are basically "but you are doing it wrong".
Present to them why something is wrong, the problems found when doing it wrong, what should be done to correct it, how it can be implemented, and what the advantages are to doing it correctly. If they see the light and decide to fix it, perfect. If not, start looking for another job, because a company that willfully continues to do something the wrong way will only end up costing themselves time and money, and stress the hell out of you...
And I intend to steal the above two paragraphs from you!
SpeederA
25th October 2006, 08:13 PM
Using Excel as a database is the work of the devil. They can't do something stupid, and expect you to unstupid it.
Apparently you've never been a consultant. :D
Grimoire
25th October 2006, 08:28 PM
And I intend to steal the above two paragraphs from you!
It's OK, I've released them under GPL...
:D
69dodge
25th October 2006, 09:19 PM
Accessing them through OLEDB is much more stable, but you have to know the sheet names, and we can't count on people always observing the naming conventions. Hence, the sheet-name-reading code.
http://www.codeproject.com/aspnet/getsheetnames.asp ?
aggle-rithm
26th October 2006, 12:45 PM
http://www.codeproject.com/aspnet/getsheetnames.asp ?
This was just what I needed a week ago.
Thanks for shutting that barn door, but the horse is history...
aerosolben
27th October 2006, 03:02 AM
I would agree, here. You don't need to use pointers in C#, except in the most special of cases. (I wish I could remember some good examples of such cases.)
COM.
You're welcome. :)
© 2001-2009, James Randi Educational Foundation. All Rights Reserved.