Why doesn't GCC optimize a*a*a*a*a*a to (a*a*a)*(a*a*a)?

I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optimize the call pow(a,2) by compiling it into a*a, but the call pow(a,6) is not optimized and will actually call the library function pow, which greatly slows down the performance. (In contrast, Intel C++ Compiler, executable icc, will eliminate the library call for pow(a,6).)

What I am curious about is that when I replaced pow(a,6) with a*a*a*a*a*a using GCC 4.5.1 and options "-O3 -lm -funroll-loops -msse4", it uses 5 mulsd instructions:

movapd  %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm14, %xmm13

while if I write (a*a*a)*(a*a*a), it will produce

movapd  %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm14, %xmm13
mulsd   %xmm13, %xmm13

which reduces the number of multiply instructions to 3. icc has similar behavior.

Why do compilers not recognize this optimization trick?
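
To reproduce the comparison in isolation, the two spellings can be placed in separate functions and compiled with something like `gcc -O3 -S`. This is only a minimal sketch: the function names `pow6_chain` and `pow6_grouped` are made up for illustration and are not part of the original application. Keep in mind that the two expressions round intermediate results at different points, so they are not guaranteed to be bit-identical for arbitrary inputs.

```c
/* Hypothetical helper names; only the two expressions come from the question. */

/* Evaluated left to right as ((((a*a)*a)*a)*a)*a: five multiplications. */
double pow6_chain(double a)
{
    return a * a * a * a * a * a;
}

/* The cube is formed once and then squared: three multiplications
 * once the compiler reuses the common subexpression. */
double pow6_grouped(double a)
{
    return (a * a * a) * (a * a * a);
}
```

For small integer-valued inputs both functions agree exactly; differences, if any, would show up in the last bits for other values.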

Error "undefined reference to 'std::cout'"

Take this example:

#include <iostream>
using namespace std;

int main()
{
    cout << "Hola, moondo.\n";
}

It throws the error:

gcc -c main.cpp
gcc -o edit main.o
main.o: In function `main':
main.cpp:(.text+0xa): undefined reference to `std::cout'
main.cpp:(.text+0xf): undefined reference to `std::basic_ostream<char,std::char_traits<char> >& std::operator<< <std::char_traits<char>>(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
main.o: In function `__static_initialization_and_destruction_0(int,int)':
main.cpp:(.text+0x3d): undefined reference to `std::ios_base::Init::Init()'
main.cpp:(.text+0x4c): undefined reference to `std::ios_base::Init::~Init()'
collect2: error: ld returned 1 exit status
make: *** [qs] Error 1

Also, this example:

#include <iostream>

int main()
{
    std::cout << "Hola, moondo.\n";
}

throws the error:

gcc -c main.cpp
gcc -o edit main.o
main.o: In function `main':
main.cpp:(.text+0xa): undefined reference to `std::cout'
main.cpp:(.text+0xf): undefined reference to `std::basic_ostream<char,std::char_traits<char> >& std::operator<<<std::char_traits<char>>(std::basic_ostream<char,std::char_traits<char> >&, char const*)'
main.o: In function `__static_initialization_and_destruction_0(int,int)':
main.cpp:(.text+0x3d): undefined reference to `std::ios_base::Init::Init()'
main.cpp:(.text+0x4c): undefined reference to `std::ios_base::Init::~Init()'
collect2: error: ld returned 1 exit status
make: *** [qs] Error 1

Note: I am using Debian 7 (Wheezy).

Why does the order in which libraries are linked sometimes cause errors in GCC?

Why does GCC generate 15-20% faster code if I optimize for size instead of speed?

I first noticed in 2009 that GCC (at least on my projects and on my machines) has a tendency to generate noticeably faster code if I optimize for size (-Os) instead of speed (-O2 or -O3), and I have been wondering ever since why.

I have managed to create (rather silly) code that shows this surprising behavior and is sufficiently small to be posted here.

const int LOOP_BOUND = 200000000;

__attribute__((noinline))
static int add(const int& x, const int& y) {
    return x + y;
}

__attribute__((noinline))
static int work(int xval, int yval) {
    int sum(0);
    for (int i=0; i<LOOP_BOUND; ++i) {
        int x(xval+sum);
        int y(yval+sum);
        int z = add(x, y);
        sum += z;
    }
    return sum;
}

int main(int , char* argv[]) {
    int result = work(*argv[1], *argv[2]);
    return result;
}

If I compile it with -Os, it takes 0.38 s to execute this program, and 0.44 s if it is compiled with -O2 or -O3. These times are obtained consistently and with practically no noise (gcc 4.7.2, x86_64 GNU/Linux, Intel Core i5-3320M).

(Update: I have moved all assembly code to GitHub: They made the post bloated and apparently add very little value to the questions as the fno-align-* flags have the same effect.)

Here is the generated assembly with -Os and -O2.

Unfortunately, my understanding of assembly is very limited, so I have no idea whether what I did next was correct: I grabbed the assembly for -O2 and merged all its differences into the assembly for -Os except the .p2align lines, result here. This code still runs in 0.38s and the only difference is the .p2align stuff.

If I guess correctly, these are paddings for stack alignment. According to Why does GCC pad functions with NOPs? it is done in the hope that the code will run faster, but apparently this optimization backfired in my case.

Is it the padding that is the culprit in this case? Why and how?

The noise it makes pretty much makes timing micro-optimizations impossible.

How can I make sure that such accidental lucky / unlucky alignments are not interfering when I do micro-optimizations (unrelated to stack alignment) on C or C++ source code?


Following Pascal Cuoq's answer I tinkered a little bit with the alignments. By passing -O2 -fno-align-functions -fno-align-loops to gcc, all .p2align are gone from the assembly and the generated executable runs in 0.38s. According to the gcc documentation:

-Os enables all -O2 optimizations [but] -Os disables the following optimization flags:

  -falign-functions  -falign-jumps  -falign-loops
  -falign-labels  -freorder-blocks  -freorder-blocks-and-partition

So, it pretty much seems like a (mis)alignment issue.

I am still skeptical about -march=native as suggested in Marat Dukhan's answer. I am not convinced that it isn't just interfering with this (mis)alignment issue; it has absolutely no effect on my machine. (Nevertheless, I upvoted his answer.)


We can take -Os out of the picture. The following times are obtained by compiling with

  • -O2 -fno-omit-frame-pointer 0.37s

  • -O2 -fno-align-functions -fno-align-loops 0.37s

  • -S -O2 then manually moving the assembly of add() after work() 0.37s

  • -O2 0.44s

It looks to me like the distance of add() from the call site matters a lot. I have tried perf, but the output of perf stat and perf report makes very little sense to me. However, I could only get one consistent result out of it:

For -O2:


 602,312,864 stalled-cycles-frontend   #    0.00% frontend cycles idle
       3,318 cache-misses
 0.432703993 seconds time elapsed
 81.23%  a.out  a.out              [.] work(int, int)
 18.50%  a.out  a.out              [.] add(int const&, int const&) [clone .isra.0]
       ¦   __attribute__((noinline))
       ¦   static int add(const int& x, const int& y) {
       ¦       return x + y;
100.00 ¦     lea    (%rdi,%rsi,1),%eax
       ¦   }
       ¦   ? retq
       ¦            int z = add(x, y);
  1.93 ¦    ? callq  add(int const&, int const&) [clone .isra.0]
       ¦            sum += z;
 79.79 ¦      add    %eax,%ebx

For -fno-align-*:

 604,072,552 stalled-cycles-frontend   #    0.00% frontend cycles idle
       9,508 cache-misses
 0.375681928 seconds time elapsed
 82.58%  a.out  a.out              [.] work(int, int)
 16.83%  a.out  a.out              [.] add(int const&, int const&) [clone .isra.0]
       ¦   __attribute__((noinline))
       ¦   static int add(const int& x, const int& y) {
       ¦       return x + y;
 51.59 ¦     lea    (%rdi,%rsi,1),%eax
       ¦   }
       ¦    __attribute__((noinline))
       ¦    static int work(int xval, int yval) {
       ¦        int sum(0);
       ¦        for (int i=0; i<LOOP_BOUND; ++i) {
       ¦            int x(xval+sum);
  8.20 ¦      lea    0x0(%r13,%rbx,1),%edi
       ¦            int y(yval+sum);
       ¦            int z = add(x, y);
 35.34 ¦    ? callq  add(int const&, int const&) [clone .isra.0]
       ¦            sum += z;
 39.48 ¦      add    %eax,%ebx
       ¦    }

For -fno-omit-frame-pointer:

 404,625,639 stalled-cycles-frontend   #    0.00% frontend cycles idle
      10,514 cache-misses
 0.375445137 seconds time elapsed
 75.35%  a.out  a.out              [.] add(int const&, int const&) [clone .isra.0]
 24.46%  a.out  a.out              [.] work(int, int)
       ¦   __attribute__((noinline))
       ¦   static int add(const int& x, const int& y) {
 18.67 ¦     push   %rbp
       ¦       return x + y;
 18.49 ¦     lea    (%rdi,%rsi,1),%eax
       ¦   const int LOOP_BOUND = 200000000;
       ¦   __attribute__((noinline))
       ¦   static int add(const int& x, const int& y) {
       ¦     mov    %rsp,%rbp
       ¦       return x + y;
       ¦   }
 12.71 ¦     pop    %rbp
       ¦   ? retq
       ¦            int z = add(x, y);
       ¦    ? callq  add(int const&, int const&) [clone .isra.0]
       ¦            sum += z;
 29.83 ¦      add    %eax,%ebx

It looks like we are stalling on the call to add() in the slow case.

I have examined everything that perf -e can spit out on my machine; not just the stats that are given above.

For the same executable, the stalled-cycles-frontend shows linear correlation with the execution time; I did not notice anything else that would correlate so clearly. (Comparing stalled-cycles-frontend for different executables doesn't make sense to me.)

I included the cache misses as it came up as the first comment. I examined all the cache misses that can be measured on my machine by perf, not just the ones given above. The cache misses are very very noisy and show little to no correlation with the execution times.

How can I add a default include path for GCC in Linux?

I'd like GCC to include files from $HOME/include in addition to the usual include directories, but there doesn't seem to be an analogue to $LD_LIBRARY_PATH.

I know I can just add the include directory at command line when compiling (or in the makefile), but I'd really like a universal approach here, as in the library case.

Removing trailing newline character from fgets() input

I am trying to get some data from the user and send it to another function in gcc. The code is something like this.

printf("Enter your Name: ");
if (fgets(Name, sizeof Name, stdin) == NULL) {
    fprintf(stderr, "Error reading Name.\n");
}

However, I find that it has a newline \n character at the end. So if I enter John, it ends up sending John\n. How do I remove that \n and send a proper string?
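
One common way to drop the trailing newline is to cut the buffer at the first '\n' using `strcspn` from `<string.h>`. The sketch below assumes the buffer was filled by a successful `fgets` call; the helper name `strip_newline` is hypothetical and not part of the original code.

```c
#include <string.h>

/* Hypothetical helper: truncate s at its first '\n', if there is one.
 * strcspn returns the length of the prefix that contains no '\n';
 * when no newline is present that is strlen(s), so writing '\0'
 * there leaves the string unchanged. */
char *strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
    return s;
}
```

After the `fgets` call succeeds, `strip_newline(Name)` would turn "John\n" into "John". It is also harmless when the input line was too long for the buffer and no newline was stored.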

Why is my program slow when looping over exactly 8192 elements?

Here is the extract from the program in question. The matrix img[][] has the size SIZE×SIZE, and is initialized at:

img[j][i] = 2 * j + i

Then, you make a matrix res[][], and each field in here is made to be the average of the 9 fields around it in the img matrix. The border is left at 0 for simplicity.

for(i=1;i<SIZE-1;i++)
    for(j=1;j<SIZE-1;j++) {
        for(k=-1;k<2;k++) for(l=-1;l<2;l++)
                res[j][i] += img[j+l][i+k];
        res[j][i] /= 9;
    }

That's all there is to the program. For completeness' sake, here is what comes before. No code comes after. As you can see, it's just initialization.

#define SIZE 8192
float img[SIZE][SIZE]; // input image
float res[SIZE][SIZE]; //result of mean filter
int i,j,k,l;
for(i=0;i<SIZE;i++)
    for(j=0;j<SIZE;j++)
        img[j][i] = (2*j+i)%8196;

Basically, this program is slow when SIZE is a multiple of 2048, e.g. the execution times:

SIZE = 8191: 3.44 secs
SIZE = 8192: 7.20 secs
SIZE = 8193: 3.18 secs

The compiler is GCC. From what I know, this is because of memory management, but I don't really know too much about that subject, which is why I'm asking here.

Also how to fix this would be nice, but if someone could explain these execution times I'd already be happy enough.

I already know of malloc/free, but the problem is not amount of memory used, it's merely execution time, so I don't know how that would help.

This C function should always return false, but it doesn’t

I stumbled over an interesting question in a forum a long time ago and I want to know the answer.

Consider the following C function:


#include <stdbool.h>

bool f1()
{
    int var1 = 1000;
    int var2 = 2000;
    int var3 = var1 + var2;
    return (var3 == 0) ? true : false;
}

This should always return false since var3 == 3000. The main function looks like this:


#include <stdio.h>
#include <stdbool.h>

int main()
{
    printf( f1() == true ? "true\n" : "false\n");
    if( f1() )
        printf("executed\n");
    return 0;
}

Since f1() should always return false, one would expect the program to print only one false to the screen. But after compiling and running it, executed is also displayed:

$ gcc main.c f1.c -o test
$ ./test
false
executed

Why is that? Does this code have some sort of undefined behavior?

Note: I compiled it with gcc (Ubuntu 4.9.2-10ubuntu13) 4.9.2.

fatal error: Python.h: No such file or directory

I am trying to build a shared library using a C extension file but first I have to generate the output file using the command below:

gcc -Wall utilsmodule.c -o Utilc

After executing the command, I get this error message:

utilsmodule.c:1:20: fatal error: Python.h: No such file or directory
compilation terminated.

I have tried all the suggested solutions over the internet but the problem still exists. I have no problem with Python.h. I managed to locate the file on my machine.

How do the likely/unlikely macros in the Linux kernel work and what is their benefit?

I've been digging through some parts of the Linux kernel, and found calls like this:

if (unlikely(fd < 0))
    /* Do something */


if (likely(!err))
    /* Do something */

I've found the definition of them:

#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

I know that they are for optimization, but how do they work? And how much performance/size decrease can be expected from using them? And is it worth the hassle (and probably the loss of portability), at least in bottleneck code (in userspace, of course)?
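
As a small userspace illustration, the same two macros can be dropped into an ordinary function. This is a sketch that assumes GCC or Clang, since `__builtin_expect` is a compiler builtin rather than standard C; the function `sum_or_error` is hypothetical. The builtin simply returns the value of its first argument, so the hint never changes the result, only what the compiler assumes about which branch is the common one.

```c
#include <stddef.h>

/* Definitions as quoted from the kernel sources in the question. */
#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

/* Hypothetical example: sum an array, treating a NULL pointer as a
 * rare error case that the compiler may move off the hot path. */
long sum_or_error(const int *values, size_t count)
{
    if (unlikely(values == NULL))
        return -1;              /* hinted as the uncommon branch */

    long total = 0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

Comparing the output of `gcc -O2 -S` with and without the macro is one way to see whether the hint changes the block layout for a given function.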

How do I best silence a warning about unused variables?

I have a cross platform application and in a few of my functions not all the values passed to functions are utilised. Hence I get a warning from GCC telling me that there are unused variables.

What would be the best way of coding around the warning?

An #ifdef around the function?

#ifdef _MSC_VER
void ProcessOps::sendToExternalApp(QString sAppName, QString sImagePath, qreal qrLeft, qreal qrTop, qreal qrWidth, qreal qrHeight)
#else
void ProcessOps::sendToExternalApp(QString sAppName, QString sImagePath, qreal /*qrLeft*/, qreal /*qrTop*/, qreal /*qrWidth*/, qreal /*qrHeight*/)
#endif

This is so ugly but seems like the way the compiler would prefer.

Or do I assign zero to the variable at the end of the function? (which I hate because it's altering something in the program flow to silence a compiler warning).

Is there a correct way?
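
For comparison, one widely used idiom that needs neither an #ifdef nor an assignment is to cast the unused parameter to void. The sketch below is plain C (the cast works the same way in C++), and the function `on_event` with its parameters is purely hypothetical; whether this is the "correct" way is a matter of project convention.

```c
/* Hypothetical callback whose signature is dictated by some interface,
 * so the second parameter cannot simply be removed. */
int on_event(int event_id, void *user_data)
{
    (void)user_data;   /* marks the parameter as intentionally unused */
    return event_id * 2;
}
```

The cast evaluates the name and discards the result, so it does not alter program flow the way assigning a value would.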

Why does the C preprocessor interpret the word "linux" as the constant "1"?

Why does the C preprocessor in GCC interpret the word linux (small letters) as the constant 1?


#include <stdio.h>
int main(void)
{
    int linux = 5;
    return 0;
}

Result of $ gcc -E test.c (stop after the preprocessing stage):

int main(void)
{
    int 1 = 5;
    return 0;
}

Which of course yields an error.

(BTW: There is no #define linux in the stdio.h file.)

What exactly is LLVM?

I keep hearing about LLVM all the time. It's in Perl, then it's in Haskell, then someone uses it in some other language? What is it?

  • What exactly distinguishes it from GCC (perspectives = safety etc.)?

gcc makefile error: "No rule to make target ..."

I'm trying to use GCC (linux) with a makefile to compile my project.

I get the following error, which I can't seem to decipher in this context:

"No rule to make target 'vertex.cpp', needed by 'vertex.o'.  Stop."

This is the makefile:

a.out: vertex.o edge.o elist.o main.o vlist.o enode.o vnode.o
    g++ vertex.o edge.o elist.o main.o vlist.o enode.o vnode.o

main.o: main.cpp main.h
    g++ -c main.cpp

vertex.o: vertex.cpp vertex.h
    g++ -c vertex.cpp

edge.o: edge.cpp edge.h
    g++ -c num.cpp

vlist.o: vlist.cpp vlist.h
    g++ -c vlist.cpp

elist.o: elist.cpp elist.h
    g++ -c elist.cpp

vnode.o: vnode.cpp vnode.h
    g++ -c vnode.cpp

enode.o: enode.cpp enode.h
    g++ -c node.cpp

How do I list the symbols in a .so file

How do I list the symbols being exported from a .so file? If possible, I'd also like to know their source (e.g. if they are pulled in from a static library).

I'm using gcc 4.0.2, if that makes a difference.

Debug vs Release in CMake

In a GCC compiled project,

  • How do I run CMake for each target type (debug/release)?
  • How do I specify debug and release C/C++ flags using CMake?
  • How do I express that the main executable will be compiled with g++ and one nested library with gcc?

How do you get assembler output from C/C++ source in GCC?

How does one do this?

If I want to analyze how something is getting compiled, how would I get the emitted assembly code?

How to get rid of `deprecated conversion from string constant to ‘char*’` warnings in GCC

I'm working on an exceedingly large codebase, and recently upgraded to GCC 4.3, which now triggers this warning:

warning: deprecated conversion from string constant to ‘char*’

Obviously, the correct way to fix this is to find every declaration like

char *s = "constant string";

or function call like:

void foo(char *s);
foo("constant string");

and make them const char pointers. However, that would mean touching 564 files, minimum, which is not a task I wish to perform at this point in time. The problem right now is that I'm running with -Werror, so I need some way to stifle these warnings. How can I do that?

How exactly does __attribute__((constructor)) work?

It seems pretty clear that it is supposed to set things up.

  1. When exactly does it run?
  2. Why are there two parentheses?
  3. Is __attribute__ a function? A macro? Syntax?
  4. Does this work in C? C++?
  5. Does the function it works with need to be static?
  6. When does __attribute__((destructor)) run?

Example in Objective-C:

__attribute__((constructor))
static void initialize_navigationBarImages() {
  navigationBarImages = [[NSMutableDictionary alloc] init];
}

__attribute__((destructor))
static void destroy_navigationBarImages() {
  [navigationBarImages release];
}
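
The same pattern can be tried in plain C. The following is a minimal sketch assuming GCC or Clang on a platform that supports these attributes; the names `setup`, `teardown`, `setup_done` and `setup_has_run` are hypothetical and chosen only for illustration.

```c
#include <stdio.h>

static int setup_done = 0;   /* hypothetical flag set by the constructor */

/* Called automatically before main() is entered. */
__attribute__((constructor))
static void setup(void)
{
    setup_done = 1;
}

/* Called automatically after main() returns or exit() is called. */
__attribute__((destructor))
static void teardown(void)
{
    fputs("teardown ran\n", stderr);
}

/* Lets other code check that the constructor has already executed. */
int setup_has_run(void)
{
    return setup_done;
}
```

Calling `setup_has_run()` as the first statement of `main()` should already return 1, which gives a simple way to observe the ordering.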

How do I force make/GCC to show me the commands?

I'm trying to debug a compilation problem, but I cannot seem to get GCC (or maybe it is make??) to show me the actual compiler and linker commands it is executing.

Here is the output I am seeing:

  CCLD   libvirt_parthelper
libvirt_parthelper-parthelper.o: In function `main':
/root/qemu-build/libvirt-0.9.0/src/storage/parthelper.c:102: undefined reference to `ped_device_get'
/root/qemu-build/libvirt-0.9.0/src/storage/parthelper.c:116: undefined reference to `ped_disk_new'
/root/qemu-build/libvirt-0.9.0/src/storage/parthelper.c:122: undefined reference to `ped_disk_next_partition'
/root/qemu-build/libvirt-0.9.0/src/storage/parthelper.c:172: undefined reference to `ped_disk_next_partition'
/root/qemu-build/libvirt-0.9.0/src/storage/parthelper.c:172: undefined reference to `ped_disk_next_partition'
collect2: ld returned 1 exit status
make[3]: *** [libvirt_parthelper] Error 1

What I want to see should be similar to this:

$ make
gcc -Wall   -c -o main.o main.c
gcc -Wall   -c -o hello_fn.o hello_fn.c
gcc   main.o hello_fn.o   -o main

Notice how this example has the complete gcc command displayed. The above example merely shows things like "CCLD libvirt_parthelper". I'm not sure how to control this behavior.

setup script exited with error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

When I try to install odoo-server, I get the following error:

error: Setup script exited with error: command 'x86_64-linux-gnu-gcc' failed with exit status 1

Could anyone help me to solve this issue?

Inheriting constructors

Why does this code:

class A
{
    public:
        explicit A(int x) {}
};

class B: public A
{
};

int main(void)
{
    B *b = new B(5);
    delete b;
}

Result in these errors:

main.cpp: In function ‘int main()’:
main.cpp:13: error: no matching function for call to ‘B::B(int)’
main.cpp:8: note: candidates are: B::B()
main.cpp:8: note:                 B::B(const B&)

Shouldn't B inherit A's constructor?

(this is using gcc)

Compiling an application for use in highly radioactive environments

We are compiling an embedded C++ application that is deployed in a shielded device in an environment bombarded with ionizing radiation. We are using GCC and cross-compiling for ARM. When deployed, our application generates some erroneous data and crashes more often than we would like. The hardware is designed for this environment, and our application has run on this platform for several years.

Are there changes we can make to our code, or compile-time improvements that can be made to identify/correct soft errors and memory-corruption caused by single event upsets? Have any other developers had success in reducing the harmful effects of soft errors on a long-running application?

"Agreeing to the Xcode/iOS license requires admin privileges, please re-run as root via sudo." when using GCC

While attempting to compile my C program, running the following command:

gcc pthread.c -o pthread

returns:

Agreeing to the Xcode/iOS license requires admin privileges, please re-run as root via sudo.

and my code does not compile.

Why is this happening and how can I fix this problem?

Convert char to int in C and C++

How do I convert a char to an int in C and C++?
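
The question can be read in two ways: obtaining the numeric value of a digit character, or obtaining the character's code. The sketch below shows both in C (the same expressions are valid C++); the helper names `digit_value` and `char_code` are hypothetical.

```c
/* Numeric value of a decimal digit character, e.g. '7' -> 7.
 * Returns -1 when c is not a decimal digit. */
int digit_value(char c)
{
    if (c < '0' || c > '9')
        return -1;
    return c - '0';   /* the C standard guarantees '0'..'9' are contiguous */
}

/* Character code as a non-negative int, independent of whether
 * plain char is signed on the target platform. */
int char_code(char c)
{
    return (unsigned char)c;
}
```

Going through `unsigned char` avoids negative results for characters outside the basic character set on platforms where `char` is signed.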

Undefined reference to vtable

When building my C++ program, I'm getting the error message

undefined reference to 'vtable...

What is the cause of this problem? How do I fix it?

It so happens that I'm getting the error for the following code (The class in question is CGameModule.) and I cannot for the life of me understand what the problem is. At first, I thought it was related to forgetting to give a virtual function a body, but as far as I understand, everything is all here. The inheritance chain is a little long, but here is the related source code. I'm not sure what other information I should provide.

Note: The constructor is where this error is happening, it'd seem.

My code:

class CGameModule : public CDasherModule {
  CGameModule(Dasher::CEventHandler *pEventHandler, CSettingsStore *pSettingsStore, CDasherInterfaceBase *pInterface, ModuleID_t iID, const char *szName)
  : CDasherModule(pEventHandler, pSettingsStore, iID, 0, szName)
  {
      g_pLogger->Log("Inside game module constructor");
      m_pInterface = pInterface;
  }

  virtual ~CGameModule() {};

  std::string GetTypedTarget();

  std::string GetUntypedTarget();

  bool DecorateView(CDasherView *pView) {
      //g_pLogger->Log("Decorating the view");
      return false;
  }

  void SetDasherModel(CDasherModel *pModel) { m_pModel = pModel; }

  virtual void HandleEvent(Dasher::CEvent *pEvent); 


  CDasherNode *pLastTypedNode;

  CDasherNode *pNextTargetNode;

  std::string m_sTargetString;

  size_t m_stCurrentStringPos;

  CDasherModel *m_pModel;

  CDasherInterfaceBase *m_pInterface;
};

Inherits from...

class CDasherModule;
typedef std::vector<CDasherModule*>::size_type ModuleID_t;

/// \ingroup Core
/// @{
class CDasherModule : public Dasher::CDasherComponent {
  CDasherModule(Dasher::CEventHandler * pEventHandler, CSettingsStore * pSettingsStore, ModuleID_t iID, int iType, const char *szName);

  virtual ModuleID_t GetID();
  virtual void SetID(ModuleID_t);
  virtual int GetType();
  virtual const char *GetName();

  virtual bool GetSettings(SModuleSettings **pSettings, int *iCount) {
    return false;
  }

  ModuleID_t m_iID;
  int m_iType;
  const char *m_szName;
};

Which inherits from....

namespace Dasher {
  class CEvent;
  class CEventHandler;
  class CDasherComponent;
}

/// \ingroup Core
/// @{
class Dasher::CDasherComponent {
  CDasherComponent(Dasher::CEventHandler* pEventHandler, CSettingsStore* pSettingsStore);
  virtual ~CDasherComponent();

  void InsertEvent(Dasher::CEvent * pEvent);
  virtual void HandleEvent(Dasher::CEvent * pEvent) {};

  bool GetBoolParameter(int iParameter) const;
  void SetBoolParameter(int iParameter, bool bValue) const;

  long GetLongParameter(int iParameter) const;
  void SetLongParameter(int iParameter, long lValue) const;

  std::string GetStringParameter(int iParameter) const;
  void        SetStringParameter(int iParameter, const std::string & sValue) const;

  ParameterType   GetParameterType(int iParameter) const;
  std::string     GetParameterName(int iParameter) const;

  Dasher::CEventHandler *m_pEventHandler;
  CSettingsStore *m_pSettingsStore;
};
/// @}

GCC -fPIC option

I have read about GCC's Options for Code Generation Conventions, but could not understand what "Generate position-independent code (PIC)" does. Please give an example to explain what it means.

What is the difference between g++ and gcc?

What is the difference between g++ and gcc? Which one of them should be used for general C++ development?

I don't understand -Wl,-rpath -Wl,

For convenience I added the relevant manpages below.

My (mis)understanding first: If I need to separate options with ,, that means that the second -Wl is not another option because it comes before , which means it is an argument to the -rpath option.

I don't understand how -rpath can have a -Wl,. argument!

What would make sense in my mind would be this:

-Wl,-rpath .

This should invoke -rpath linker option with the current directory argument.

man gcc:

-Wl,option

Pass option as an option to the linker. If option contains commas, it is split into multiple options at the commas. You can use this syntax to pass an argument to the option. For example, -Wl,-Map,output.map passes -Map output.map to the linker. When using the GNU linker, you can also get the same effect with `-Wl,-Map=output.map'.

man ld:

-rpath=dir

Add a directory to the runtime library search path. This is used when linking an ELF executable with shared objects. All -rpath arguments are concatenated and passed to the runtime linker, which uses them to locate shared objects at runtime. The -rpath option is also used when locating shared objects which are needed by shared objects explicitly included in the link;

How to automatically generate a stacktrace when my program crashes

I am working on Linux with the GCC compiler. When my C++ program crashes I would like it to automatically generate a stacktrace.

My program is being run by many different users and it also runs on Linux, Windows and Macintosh (all versions are compiled using gcc).

I would like my program to be able to generate a stack trace when it crashes, and the next time the user runs it, it will ask them if it is OK to send the stack trace to me so I can track down the problem. I can handle sending the info to me, but I don't know how to generate the trace string. Any ideas?
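
For the Linux part of this, one possible starting point is the `backtrace` family from `<execinfo.h>`, which glibc provides. The sketch below is an assumption-laden illustration rather than a complete solution: it only covers platforms that ship `<execinfo.h>`, the names `crash_handler`, `install_crash_handler` and `current_stack_depth` are hypothetical, and calling `backtrace` from a signal handler is not guaranteed to be safe in every situation.

```c
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical handler: write the raw stack trace to stderr and exit. */
static void crash_handler(int sig)
{
    void *frames[MAX_FRAMES];
    int depth = backtrace(frames, MAX_FRAMES);
    /* backtrace_symbols_fd writes straight to a file descriptor
     * instead of returning heap-allocated strings. */
    backtrace_symbols_fd(frames, depth, STDERR_FILENO);
    _exit(128 + sig);
}

/* Install the handler for two common crash signals; 0 on success. */
int install_crash_handler(void)
{
    if (signal(SIGSEGV, crash_handler) == SIG_ERR)
        return -1;
    if (signal(SIGABRT, crash_handler) == SIG_ERR)
        return -1;
    return 0;
}

/* Helper to check that backtrace() yields frames on this platform. */
int current_stack_depth(void)
{
    void *frames[MAX_FRAMES];
    return backtrace(frames, MAX_FRAMES);
}
```

To keep the trace for the next run, the handler could write to a file descriptor opened at startup instead of `STDERR_FILENO`. With glibc, linking with `-rdynamic` is typically needed for function names to appear in the output.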