All about Unsafe Code in C#
C# and .NET hide most of memory management from the developer, thanks to the
garbage collector and the use of references. But some cases need direct access
to memory, and unsafe code was introduced to keep the language powerful enough
for them.
Most .NET programming doesn't need unsafe code, but in some cases there is no
way around it, such as the following:
- Real-time applications: we might need pointers to improve performance in
such applications.
- External functions: some functions in non-.NET DLLs require a pointer as a
parameter, such as Windows APIs that were written in C.
- Debugging: sometimes we need to inspect memory contents for debugging
purposes, or we might need to write an application that analyzes another
application's process and memory.
Unsafe code is mostly about pointers, which have the following advantages and
disadvantages.
Advantages of Unsafe Code in C#:
- Performance and flexibility: by using pointers you can access data and
manipulate it in the most efficient way possible.
- Compatibility: in many cases we still need to use old Windows APIs, which
use pointers extensively, or third parties may supply DLLs in which some
functions need pointer parameters. Although the DllImport declaration can
often be written in a way that avoids pointers, in some cases it's just much
simpler to use a pointer.
- Memory addresses: there is no way to know the memory address of some data
without using pointers.
Disadvantages of Unsafe Code in C#:
- Complex syntax: to use pointers you need to go through more complex syntax
than we are used to in C#.
- Harder to use: you need to be more careful and logical while using pointers.
Misusing pointers might lead to the following:
  - Overwriting other variables.
  - Stack overflow.
  - Accessing areas of memory that contain no valid data as if they did.
  - Overwriting information that the .NET runtime relies on, which will
  surely lead your application to crash.
- Your code will be harder to debug. A simple mistake in using pointers
might lead your application to crash randomly and unpredictably.
- Type safety: code that uses pointers fails the .NET type-safety checks,
and if your security policy doesn't allow non-type-safe code, the .NET
Framework will refuse to execute your application.
Now that we know the risks we might face while using pointers, and the
performance and flexibility they offer, let us see how to use them. The keyword
unsafe is used when dealing with pointers; the name reflects the risks that you
might face while using it. Let's see where to place it. We can declare a whole
class as unsafe:
unsafe class Class1
{
    // you can use pointers here!
}
Or only some class members can be declared as unsafe:
class Class1
{
    // pointer field
    unsafe int* ptr;

    unsafe void MyMethod()
    {
        // you can use pointers here
    }
}
The same applies to other members such as constructors and properties.
To declare unsafe local variables in a method, you have to put them in an
unsafe block, as follows:
static void Main()
{
    // can't use pointers here
    unsafe
    {
        // you can declare and use pointers here
    }
    // can't use pointers here
}
You can't declare local pointers in a "safe" method the way we declared the
pointer field above; we have to put them in an unsafe block.
static void Main()
{
    unsafe int* ptri; // Wrong: the unsafe modifier is not allowed on a local
}
If you get too excited and try to use unsafe right away, compiling the code
with just
csc test.cs
will produce the following error:
error CS0227: Unsafe code may only appear if compiling with /unsafe
To compile unsafe code, use the /unsafe option:
csc test.cs /unsafe
In VS.NET, go to the project property page and, under "Configuration
Properties > Build", set Allow Unsafe Code Blocks to True.
Now that we know how to declare a block as unsafe, we should learn how to
declare and use pointers in it.
Declaring pointers
To declare a pointer of any type, all you have to do is put "*" after the
type name, such as
int * ptri;
double * ptrd;
NOTE: If you are used to pointers in C or C++, be careful: in the C#
declaration int* ptri, i; the "*" applies to the type itself, not to the
variable, so "i" is a pointer here as well. Array declarations work the same
way.
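The note above can be checked with a small sketch. The class and method names here are made up for the illustration; the point is only that both variables in the declaration are pointers.

```csharp
using System;

static class PointerDeclarationDemo
{
    // Returns true when both variables declared in one statement can be
    // used as pointers to the same int.
    public static unsafe bool BothArePointers()
    {
        int value = 42;
        int* first, second;   // both are int*; the C-style "int *first, *second;" does not compile in C#
        first = &value;
        second = first;
        return *first == 42 && *second == 42;
    }

    static void Main()
    {
        Console.WriteLine(BothArePointers()); // True
    }
}
```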
void Pointers
If you want to declare a pointer, but you do not wish to specify a type for
it, you can declare it as void.
void *ptrVoid;
The main use of this is calling an API function that requires void*
parameters. Within the C# language, there isn't a great deal that you can
do using void pointers.
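As a minimal sketch of the little that can be done with a void pointer: any pointer converts to void* implicitly, but it has to be cast back to a typed pointer before it can be dereferenced. The names below are hypothetical.

```csharp
using System;

static class VoidPointerDemo
{
    // Passes the address of an int through a void* and reads it back.
    public static unsafe int ReadThroughVoidPointer(int value)
    {
        void* untyped = &value;       // implicit conversion from int* to void*
        int* typed = (int*)untyped;   // explicit cast is required before dereferencing
        return *typed;
    }

    static void Main()
    {
        Console.WriteLine(ReadThroughVoidPointer(7)); // 7
    }
}
```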
Using pointers
Using pointers can be demonstrated in the following example:
static void Main()
{
    int var1 = 5;
    unsafe
    {
        int* ptr1, ptr2;
        ptr1 = &var1;
        ptr2 = ptr1;
        *ptr2 = 20;
    }
    Console.WriteLine(var1);
}
The operator "&" means "address of": ptr1 will hold the address of var1, and
ptr2 = ptr1 will assign the address of var1, which ptr1 was holding, to ptr2.
Using "*" before the pointer name means "the content at the address", so 20
will be written where ptr2 points.
The value of var1 is now 20.
sizeof operator
As the name says, the sizeof operator returns the number of bytes occupied by
the given data type:
unsafe
{
    Console.WriteLine("sbyte: {0}", sizeof(sbyte));
    Console.WriteLine("byte: {0}", sizeof(byte));
    Console.WriteLine("short: {0}", sizeof(short));
    Console.WriteLine("ushort: {0}", sizeof(ushort));
    Console.WriteLine("int: {0}", sizeof(int));
    Console.WriteLine("uint: {0}", sizeof(uint));
    Console.WriteLine("long: {0}", sizeof(long));
    Console.WriteLine("ulong: {0}", sizeof(ulong));
    Console.WriteLine("char: {0}", sizeof(char));
    Console.WriteLine("float: {0}", sizeof(float));
    Console.WriteLine("double: {0}", sizeof(double));
    Console.WriteLine("decimal: {0}", sizeof(decimal));
    Console.WriteLine("bool: {0}", sizeof(bool));
    // did I miss something?!
}
The output will be:
sbyte: 1
byte: 1
short: 2
ushort: 2
int: 4
uint: 4
long: 8
ulong: 8
char: 2
float: 4
double: 8
decimal: 16
bool: 1
Great, we don't have to remember the size of every data type anymore!
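sizeof also works on your own structs, as long as they contain no reference types. The sketch below uses a hypothetical struct named Sample. In current compilers sizeof on a built-in type is a compile-time constant and needs no unsafe context, while sizeof on a user-defined struct still does.

```csharp
using System;

// Hypothetical struct used only for this illustration.
struct Sample
{
    public long X;
    public double D;
}

static class SizeOfDemo
{
    // sizeof on a user-defined struct is only allowed in an unsafe context.
    public static unsafe int SizeOfSample()
    {
        return sizeof(Sample);
    }

    static void Main()
    {
        Console.WriteLine(sizeof(int));     // 4
        Console.WriteLine(SizeOfSample());  // 16: two 8-byte fields
    }
}
```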
Casting Pointers
A pointer actually stores an integer that represents a memory address, so
it's not surprising that you can explicitly convert any pointer to or from any
integer type. The following code is totally legal.
int x = 10;
int *px;
px = &x;
uint y = (uint) px;
int *py = (int*) y;
A good reason for casting pointers to integer types is in order to display
them. Console.Write() and Console.WriteLine() methods do not have any overloads
to take pointers. Casting a pointer to an integer type will solve the problem.
Console.WriteLine("The Address is: " + (uint) px);
As I mentioned before, it's totally legal to cast a pointer to any integer
type. But does that really mean that we can use any integer type for casting?
What about overflows? On a 32-bit machine, where an address runs from zero to
about 4 billion, we can use uint, long and ulong. On a 64-bit machine we can
only use ulong. Note that casting the pointer to other integer types is very
likely to cause an overflow. The real problem is that the checked keyword
doesn't apply to conversions involving pointers. For such conversions, no
exception is raised when an overflow occurs, even in a checked context.
When you are using pointers, the .NET Framework assumes that you know what
you're doing and that you'll be happy with the overflows!
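A minimal sketch of a safe round trip, assuming we pick ulong because it is wide enough for an address in both 32-bit and 64-bit processes. The class and method names are made up for this illustration.

```csharp
using System;

static class PointerCastDemo
{
    // Converts a pointer to ulong and back, then checks that nothing was lost.
    public static unsafe bool RoundTripsThroughUlong()
    {
        int x = 10;
        int* px = &x;
        ulong address = (ulong)px;   // wide enough on 32-bit and 64-bit
        int* py = (int*)address;
        return py == px && *py == 10;
    }

    static void Main()
    {
        Console.WriteLine("Pointer size in this process: " + IntPtr.Size + " bytes");
        Console.WriteLine(RoundTripsThroughUlong()); // True
    }
}
```

IntPtr.Size reports 4 or 8 depending on the process, which is a quick way to tell whether a uint cast could truncate an address.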
You can explicitly convert between pointers pointing to different types. For
example:
byte aByte = 8;
byte *pByte = &aByte;
double *pDouble = (double*) pByte;
This is perfectly legal code, but think twice before trying something like
that. In the above example, the double value pointed to by pDouble will
actually consist of the byte (which is 8) combined with the seven bytes of
memory that happen to follow it, which surely won't give a meaningful value.
However, you might want to convert between types in order to implement a
union, or you might want to cast pointers to other types into pointers to
sbyte in order to examine individual bytes of memory.
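Examining individual bytes can be sketched as follows. This version uses byte* rather than sbyte* so the values print as unsigned; the helper name BytesOf is hypothetical.

```csharp
using System;

static class ByteViewDemo
{
    // Copies the bytes of an int into an array by reading them through a byte pointer.
    public static unsafe byte[] BytesOf(int value)
    {
        byte[] result = new byte[sizeof(int)];
        byte* p = (byte*)&value;   // reinterpret the int's storage as raw bytes
        for (int i = 0; i < sizeof(int); i++)
        {
            result[i] = p[i];
        }
        return result;
    }

    static void Main()
    {
        byte[] bytes = BytesOf(0x01020304);
        // The order depends on the machine: 04-03-02-01 on little-endian hardware.
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```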
Pointer Arithmetic
It's possible to use the operators +, -, +=, -=, ++ and -- with pointers,
with an integer (int, uint, long or ulong) on the right-hand side of the
operator. No arithmetic is permitted on a void pointer.
For example, suppose you have a pointer to an int, and you want to add 1 to
it. The compiler will assume that you want to access the next int in memory,
and so will actually increase the value by 4 bytes, the size of an int. If
the pointer was pointing to a double, adding 1 will increase its value by 8
bytes, the size of a double.
The general rule is that adding a number X to a pointer to type T with a
value P gives the result P + X * sizeof(T).
Let's have a look at the following example:
uint u = 3;
byte b = 8;
double d = 12.5;
uint *pU = &u;
byte *pB = &b;
double *pD = &d;
Console.WriteLine("Before Operations");
Console.WriteLine("Value of pU:" + (uint) pU);
Console.WriteLine("Value of pB:" + (uint) pB);
Console.WriteLine("Value of pD:" + (uint) pD);
pU += 5;
pB -= 3;
pD++;
Console.WriteLine("\nAfter Operations");
Console.WriteLine("Value of pU:" + (uint) pU);
Console.WriteLine("Value of pB:" + (uint) pB);
Console.WriteLine("Value of pD:" + (uint) pD);
The result is:
Before Operations
Value of pU:1242784
Value of pB:1242780
Value of pD:1242772
After Operations
Value of pU:1242804
Value of pB:1242777
Value of pD:1242780
5 * 4 = 20 was added to pU.
3 * 1 = 3 was subtracted from pB.
1 * 8 = 8 was added to pD.
We can also subtract one pointer from another pointer, provided both pointers
point to the same data type. The result is a long whose value is the
difference between the pointers' values divided by the size of the type that
they point to:
double *pD1 = (double*) 12345632;
double *pD2 = (double*) 12345600;
long L = pD1 - pD2; // gives 4 = 32 / 8, where 8 is sizeof(double)
Note that initializing pointers from integer constants, as in this example, is
totally valid.
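Both rules, scaling by sizeof(T) and pointer subtraction, can be checked without depending on actual addresses. The following is a sketch with made-up names; the moved pointer is only used in arithmetic and is never dereferenced.

```csharp
using System;

static class PointerArithmeticDemo
{
    // Number of bytes a double* moves when `count` is added to it.
    public static unsafe long ByteDistance(int count)
    {
        double start = 0;
        double* p = &start;
        double* moved = p + count;        // never dereferenced
        return (byte*)moved - (byte*)p;   // byte pointers subtract in units of 1
    }

    // Number of doubles between two pointers that are `count` elements apart.
    public static unsafe long ElementDistance(int count)
    {
        double start = 0;
        double* p = &start;
        double* moved = p + count;        // never dereferenced
        return moved - p;                 // difference divided by sizeof(double)
    }

    static void Main()
    {
        Console.WriteLine(ByteDistance(3));     // 24, which is 3 * sizeof(double)
        Console.WriteLine(ElementDistance(3));  // 3
    }
}
```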
Pointers to Structs and Class members
Pointers can point to structs in the same way as before, as long as the
structs don't contain any reference types. The compiler will report an error
if a pointer points to a struct containing a reference type.
Let's look at an example.
Suppose we have the following struct:
struct MyStruct
{
    public long X;
    public double D;
}
Declaring a pointer to it will be:
MyStruct *pMyStruct;
Initializing it:
MyStruct myStruct = new MyStruct();
pMyStruct = &myStruct;
To access the members:
(*pMyStruct).X = 18;
(*pMyStruct).D = 163.26;
The syntax is a bit complex, isn't it?
That's why C# defines another operator that allows us to access members of
structs through pointers with a simpler syntax. The "pointer member access
operator" looks like an arrow; it's a dash followed by a greater-than
sign: ->
pMyStruct->X = 18;
pMyStruct->D = 163.26;
That looks better!
Fields within the struct can also be accessed directly through pointers of
their own type:
long *pL = &(myStruct.X);
double *pD = &(myStruct.D);
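Putting the three forms together, here is a self-contained sketch that repeats the MyStruct definition from above so it can be compiled on its own. The class StructPointerDemo and its method are hypothetical names.

```csharp
using System;

struct MyStruct
{
    public long X;
    public double D;
}

static class StructPointerDemo
{
    // Fills a struct through a struct pointer and a field pointer, then returns a copy.
    public static unsafe MyStruct Fill()
    {
        MyStruct myStruct = new MyStruct();
        MyStruct* pMyStruct = &myStruct;

        (*pMyStruct).X = 18;      // dereference, then member access
        pMyStruct->D = 163.26;    // same thing with the arrow operator

        long* pL = &myStruct.X;   // pointer straight to one field
        *pL += 1;                 // X becomes 19

        return myStruct;
    }

    static void Main()
    {
        MyStruct result = Fill();
        Console.WriteLine(result.X); // 19
    }
}
```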
Classes and pointers are a different story. We can't have a pointer pointing
to a class, because a class is a reference type. The garbage collector doesn't
keep any information about pointers; it's only interested in references, so
creating pointers to classes could cause the garbage collector to not work
properly.
On the other hand, class members could be value types, and it's possible to
create pointers to them. But this requires a special syntax. Remember that
class members are embedded in a class instance, which sits on the heap. That
means that they are still under the control of the garbage collector, which
can at any time decide to move the class instance to a new location. The
garbage collector knows about the reference and will update its value, but
again it's not interested in the pointers around, and they will still be
pointing to the old location.
To avoid the risk of this problem, the compiler will report an error if you
try to create pointers to class members in the same way we have been using
up to now.
The way around this problem is the keyword "fixed". It marks out a block of
code bounded by braces, and notifies the garbage collector that there may be
pointers pointing to members of certain class instances, which must not be
moved.
Let's look at an example.
Suppose we have the following class:
class MyClass
{
    public long X;
    public double D;
}
Declaring pointers to its members in the regular way is a compile-time error:
MyClass myClass = new MyClass();
long *pX = &(myClass.X); //compile-time error.
To create pointers to class members, use the fixed keyword:
fixed (long *pX = &(myClass.X))
{
    // use *pX here only.
}
The variable pX is scoped to the fixed block only, and the fixed statement
tells the garbage collector not to move "myClass" while the code inside the
block is running.
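A complete version of the fixed example might look like the sketch below. It repeats the MyClass definition so it compiles on its own; FixedDemo and SetXThroughPointer are made-up names.

```csharp
using System;

class MyClass
{
    public long X;
    public double D;
}

static class FixedDemo
{
    // Writes to a field of a heap object through a pointer while the object is pinned.
    public static unsafe long SetXThroughPointer(MyClass target, long value)
    {
        fixed (long* pX = &target.X)
        {
            // target cannot be moved by the garbage collector inside this block
            *pX = value;
        }
        // pX is out of scope here, and the object may be moved again
        return target.X;
    }

    static void Main()
    {
        MyClass myClass = new MyClass();
        Console.WriteLine(SetXThroughPointer(myClass, 18)); // 18
    }
}
```

Keeping the fixed block as short as possible is a reasonable habit, since a pinned object limits what the garbage collector can do while the block runs.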
stackalloc
The keyword "stackalloc" instructs the .NET runtime to allocate a certain
amount of memory on the stack. It needs two things to do so: the type (value
types only) and the number of elements you're allocating space for. For
example, if you want to allocate enough memory to store 4 floats, you can
write the following:
float *ptrFloat = stackalloc float [4];
Or to allocate enough memory to store 50 shorts:
short *ptrShort = stackalloc short [50];
stackalloc simply allocates memory; it doesn't initialize it to any value.
The advantage of stackalloc is its very high performance, and it's up to you
to initialize the memory locations that were allocated.
A very useful application of stackalloc is creating an array directly on the
stack. While C# has made using arrays very simple and easy, they still suffer
from the disadvantage that these arrays are actually objects derived from
System.Array, and they are stored on the heap with all of the overhead that
involves.
To create an array in the stack:
int size;
size = 6; // we can get this value at run-time as well.
int *int_ary = stackalloc int [size];
To access the array members, the obvious way is *(int_ary + i), where "i" is
the index. But it won't be surprising to know that it's also possible to use
int_ary[i].
*(int_ary + 0) = 5;   // or *int_ary = 5;
*(int_ary + 1) = 9;   // accessing member #1
*(int_ary + 2) = 16;
int_ary[3] = 19;      // another way to access members
int_ary[4] = 7;
int_ary[5] = 10;
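The same idea as a small runnable sketch, with hypothetical names. It fills a stack-allocated buffer using one access style and reads it back with the other, staying inside the allocated range.

```csharp
using System;

static class StackAllocDemo
{
    // Fills a stack-allocated buffer with squares and returns their sum.
    public static unsafe int SumOfSquares(int count)
    {
        int* squares = stackalloc int[count];
        for (int i = 0; i < count; i++)
        {
            squares[i] = i * i;        // indexer syntax
        }

        int total = 0;
        for (int i = 0; i < count; i++)
        {
            total += *(squares + i);   // equivalent pointer arithmetic
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfSquares(6)); // 0 + 1 + 4 + 9 + 16 + 25 = 55
    }
}
```

The buffer is released automatically when the method returns, so the pointer must never be stored anywhere that outlives the call.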
In a usual array, accessing a member outside the array bounds will cause an
exception. But when using stackalloc, you're simply accessing an address
somewhere on the stack; writing to it could corrupt a variable's value or,
worse, the return address of a method currently being executed.
int[] ary = new int[6];
ary[10] = 5; // exception thrown

int *pAry = stackalloc int [6];
pAry[10] = 5; // no exception: 5 is written 10 * sizeof(int) bytes past pAry
This takes us back to the beginning of the article: using pointers comes at a
cost. You have to be very certain of what you're doing; any small error could
cause very strange and hard-to-debug run-time bugs.
Conclusion
Microsoft chose the term "unsafe" to warn programmers of the risk they take on
after typing that word. Using pointers, as we have seen in this article, has a
lot of advantages, from flexibility to high performance, but a very small
error, even a simple typing mistake, might cause your entire application to
crash. Worse, this could happen randomly and unpredictably, which makes
debugging a much harder task.
- Article by Mardawi