trigraphs
Given that there were once reasons to use digraphs and trigraphs in C and C++, does anyone put them in code being written today? Is there any substantial amount of legacy code still under maintenance that contains them?
(Note: Here, "digraph" does not mean "directed graph." Both digraph and trigraph have multiple meanings, but the intended meaning here is character sequences like ??= or <: that stand in for characters like # and [.)
I have a C++14 project with the Microsoft compiler in Visual Studio 2019 and I'm trying to understand digraphs and trigraphs, so my code is a bit weird:
#include "Trigraphs.h"
void Trigraphs::assert_graphs()
??<
// How does this ever compile ????/
ouch!
??>
Reading about the /Zc:trigraphs switch
Through C++14, trigraphs are supported as in C. The C++17 standard removes trigraphs from the C++ language.
I understand that trigraphs should be supported until C++14 because they were removed in C++17 only. Yet, the above code does not compile with C++14 settings until I add the additional command line switch. I am not a native English speaker, did I get something wrong about the sentence that trigraphs are supported until C++14?
Print ?? and !! in different sequence will show different output
I found a strange output when I wrote the following lines in a very simple way:
Code:
printf("LOL??!\n");
printf("LOL!!?\n");
Output:
It happens whether the code is compiled under MBCS or UNICODE.
The output varies with the order of "?" and "!"...
Any idea?
Are digraphs transformed by a compiler and trigraphs transformed by a preprocessor?
I'm trying to understand both trigraphs and digraphs rather than use them.
I've read that post and I understood that:
Is this true?
Meaning of character literals containing trigraphs for non-representable characters
On a C compiler which uses ASCII as its character set, the value of the character literal '??<' would be equivalent to that of '{', i.e. 0x7B. What would be the value of that literal on a compiler whose character set doesn't have a { character?
Outside a string literal, a compiler could infer that ??< is supposed to have the same meaning as an open-brace character is defined to have, even if the compiler character set doesn't have an open-brace character. Indeed, the whole purpose of trigraphs is to allow sequences of representable characters to be used in place of characters that aren't representable. The spec requires that trigraphs even be processed within string literals, however, which has me puzzled. If a compiler's character set includes a { character, the compiler can allow '{' to be represented as '??<', but if the character set includes {, I see no reason a programmer wouldn't simply use that. If the character set doesn't include {, however, which would seem the only reason for using trigraphs in the first place, what representable character would a compiler be expected to replace ??< with?
I was going through a few interview questions and I came across the example below. I tried the example with simple input/output and also with some logic, and it works without any problems.
??=include <stdio.h>
int main(void)
??<
printf("Hello");
// Other code lines here
return 0;
??>
To my surprise, this worked without any compilation issue and output was as required.
What is the significance of '??=', '??<' and '??>' here?
What does the ??!??! operator do in C?
I saw a line of C that looked like this:
!ErrorHasOccured() ??!??! HandleError();
It compiled correctly and seems to run ok. It seems like it's checking if an error has occurred, and if it has, it handles it. But I'm not really sure what it's actually doing or how it's doing it. It does look like the programmer is trying to express their feelings about errors.
I have never seen the ??!??! before in any programming language, and I can't find documentation for it anywhere. (Google doesn't help with search terms like ??!??!). What does it do and how does the code sample work?
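For readers puzzling over the same line: with trigraph replacement enabled, each ??! becomes |, so ??!??! reads as the ordinary logical OR operator ||. The sketch below spells the statement out without trigraphs. The two functions and the two counters are stand-ins invented for illustration (the real ErrorHasOccured and HandleError are not shown in the question), and the behaviour relies only on the standard short-circuit rule of ||.

```c
/* Hypothetical stand-ins for the functions named in the question. */
static int error_flag = 0;      /* set to nonzero to simulate an error */
static int handler_calls = 0;   /* counts how often the handler ran */

static int ErrorHasOccured(void)
{
    return error_flag;
}

static int HandleError(void)
{
    handler_calls++;
    return 1;
}

/* Same statement as in the question, written with || directly.
 * The right-hand side is evaluated only when the left-hand side is
 * false, i.e. only when ErrorHasOccured() returns nonzero. */
static void check_for_error(void)
{
    (void)(!ErrorHasOccured() || HandleError());
}
```

Written this way it behaves like "if an error has occurred, handle it", which matches what the asker observed.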
Consider this innocuous C++ program:
#include <iostream>
int main() {
std::cout << "(Is this a trigraph??)" << std::endl;
return 0;
}
When I compile it using g++ version 5.4.0, I get the following diagnostic:
me@my-laptop:~/code/C++$ g++ -c test_trigraph.cpp
test_trigraph.cpp:4:36: warning: trigraph ??) ignored, use -trigraphs to enable [-Wtrigraphs]
std::cout << "(Is this a trigraph??)" << std::endl;
^
The program runs, and its output is as expected:
(Is this a trigraph??)
Why are string literals parsed for trigraphs at all?
Do other compilers do this, too?
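As a rough illustration of what the warning is reacting to, here is a sketch that scans a string for the nine standard trigraph sequences, in the same spirit as -Wtrigraphs. The function name count_trigraphs is made up for this example and is not part of any compiler. Inputs passed to it should be spelled with the \? escape so that the calling code contains no trigraphs itself.

```c
#include <stddef.h>
#include <string.h>

/* Count occurrences of the nine standard trigraph sequences in text.
 * Hypothetical helper for illustration only. */
static size_t count_trigraphs(const char *text)
{
    /* The third characters of the nine trigraphs. */
    static const char third[] = "=()/'<>!-";
    size_t count = 0;
    size_t i = 0;

    while (text[i] != '\0')
    {
        /* Two question marks followed by one of the nine characters.
         * The explicit '\0' check is needed because strchr would
         * otherwise match the terminator of the lookup string. */
        if (text[i] == '?' && text[i + 1] == '?' &&
            text[i + 2] != '\0' && strchr(third, text[i + 2]) != NULL)
        {
            count++;
            i += 3;   /* skip the whole sequence */
        }
        else
        {
            i++;
        }
    }
    return count;
}
```

Called on the string from the question, written as "(Is this a trigraph?\?)", it reports one sequence, the same one GCC points at.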
Why does GCC emit a warning when using trigraphs, but not when using digraphs?
Code:
#include <stdio.h>
int main(void)
{
??< puts("Hello Folks!"); ??>
}
The above program, when compiled with GCC 4.8.1 with -Wall and -std=c11, gives the following warning:
source_file.c: In function ‘main’:
source_file.c:8:5: warning: trigraph ??< converted to { [-Wtrigraphs]
??< puts("Hello Folks!"); ??>
^
source_file.c:8:30: warning: trigraph ??> converted to } [-Wtrigraphs]
But when I change the body of main to:
<% puts("Hello Folks!"); %>
no warnings are thrown.
So, why does the compiler warn me when using trigraphs, but not when using digraphs?
Curious trigraph sequence thing about ansi C
What was the original reason for trigraph sequences, where some characters become other characters, in ANSI C? For example:
??=define arraycheck(a, b) a??(b??) ??!??! b??(a??)
becomes
#define arraycheck(a, b) a[b] || b[a]
Is there a switch to disable trigraphs with clang?
I've got some (legacy) code that I'm building with clang for the first time. The code is something like:
sprintf(buf, "%s <%s ????>", p1, p2);
Clang gives the following warning (error with -Werror):
test.c:6:33: error: trigraph converted to '}' character [-Werror,-Wtrigraphs]
sprintf(buf, "%s <%s ????>", p1, p2);
^
Clearly the ??> is not intended as a trigraph, so I want to disable trigraphs entirely (the source does not intentionally use them anywhere).
I have tried -no-trigraphs but that's not really an option:
clang: warning: argument unused during compilation: '-no-trigraphs'
I can turn off the trigraphs warning with -Wno-trigraphs but I don't want the trigraph conversion to actually take place at all.
NOTE: Trigraphs were enabled as an unintended side effect of using -std=c89.
C++17 removed trigraphs. IBM heavily opposed this (here and here) so there seem to be arguments for both sides of removal/non removal.
But since the decision was made to remove trigraphs, why leave digraphs? I don't see any reasons for keeping digraphs beyond the reasons to keep trigraphs (which apparently didn't weigh enough to keep them).
Implementing trigraphs in a C89 compiler
I am attempting to write a simplistic C89 --> x86_64 compiler, based on this C89 standard draft, in C89, for learning's sake. So far, I am implementing translation phase 1. My understanding is that this consists of
I have tried to implement this with a program (please forgive any style errors I have made):
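#include <stdlib.h>
#include <string.h>

/* Map the third character of a trigraph to its replacement,
 * or return 0 if the character does not complete a trigraph. */
static char trigraph_char(char c)
{
    switch (c)
    {
    case '<':  return '{';
    case '>':  return '}';
    case '(':  return '[';
    case ')':  return ']';
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '!':  return '|';
    case '-':  return '~';
    default:   return 0;
    }
}

char *trigraph_replacement(char *code)
{
    /* The output is never longer than the input; +1 for the NUL. */
    char *temp = calloc(strlen(code) + 1, 1);
    char *temp_1 = temp;
    char *code_1 = code;
    char replacement;

    if (temp == NULL)
        return NULL; /* allocation failed; code is left untouched */

    while (*code_1)
    {
        /* Only replace when all three characters form a trigraph;
         * otherwise copy one character and rescan from the next. */
        if (code_1[0] == '?' && code_1[1] == '?' &&
            (replacement = trigraph_char(code_1[2])) != 0)
        {
            *temp_1++ = replacement;
            code_1 += 3;
        }
        else
        {
            *temp_1++ = *code_1++;
        }
    }
    free(code);
    return temp;
}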
Now, intuitively, it seems that this should do what it is supposed to do: replace all the trigraphs. However, the gcc docs say that "Trigraphs are not popular and many compilers implement them incorrectly". They go on to state that "portable code should not rely on trigraphs being either converted or ignored".
As a result, I am wondering
Why does this program produce different results depending on whether strict ANSI C compliance mode is ON or OFF? I am asking about strict compliance because most modern production compilers default to their own extended dialect of the language, and some default to C99, and so on.
#include <stdio.h>
#include <string.h>
int main (void)
{
int len;
len = strlen ("??=");
printf ("len=%d\n", len);
return 0;
}
Here is the result. In both cases the compiler option -w was passed to suppress warnings:
$ gcc t.c -w
$ ./a.out
len=3
$ gcc t.c -ansi -w
$ ./a.out
len=1
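One way to make the length independent of the compiler mode is to break up the ?? pair inside the literal with the \? escape sequence, which denotes a plain question mark and never forms part of a trigraph. A minimal sketch, assuming nothing beyond standard C (the function name portable_length is invented for the example):

```c
#include <string.h>

/* Length of a string made of two question marks followed by '='.
 * The \? escape keeps the trigraph spelling out of the source, so
 * the result does not depend on whether trigraphs are enabled. */
static size_t portable_length(void)
{
    return strlen("?\?=");
}
```

With this spelling the program should report a length of 3 both with and without -ansi.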
Digraph and trigraph can't work together?
I'm learning about digraphs and trigraphs, and here is some code which I cannot understand. (Yes, I admit that it's extremely ugly.)
This code can compile:
#define _(s) s%:%:s
main(_(_))
<%
__;
%>
This code can compile, too:
#define _(s) s??=??=s
main(_(_))
<%
__;
%>
However, neither of the following two pieces of code can compile:
#define _(s) s%:??=s
main(_(_))
<%
__;
%>
And
#define _(s) s??=%:s
main(_(_))
<%
__;
%>
This does confuse me: Since the first two pieces of code can compile, I suppose the expansion of digraph and trigraph both take place before the macro expansion. So why can't it compile when digraph and trigraph are used together?
How to make a directed tripartite network from two bipartite networks?
Excuse me, because I feel like this question should be simpler, but I can't find a satisfactory answer.
I have two quantitative bipartite networks (that show the ecological relationships among A-B, and B-C). My problem is that I do not know how to join both in order to make a directed quantitative tripartite network (like a typical food web). The B level of each bipartite network has the same vertex composition (in quantity and vertex names). In addition, levels A and C cannot interact with each other. For this reason, the order and direction of the final tripartite network should be C->B->A.
Any suggestions?
Thanks for your attention!
Simple string output not as expected (new line appearing)
I have code equivalent to the following to print out a short string:
#include <iostream>
#include <string>
int main(int argc, const char* argv[])
{
std::string s = "finished??/not finished??";
std::cout << s << std::endl;
return 0;
}
But the output is appearing across two lines and losing some characters:
finished
ot finished??
But /n isn't the newline character! What's happening?
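A sketch of one possible workaround, assuming the goal is to print the text literally: splitting the literal between the two question marks means the ??/ sequence never appears in the source, and adjacent string literals are joined later in translation. It is written in C for brevity rather than with std::string, and the function name finished_message is made up.

```c
#include <string.h>

/* Builds the intended text without writing two question marks
 * directly in front of the '/' inside a single literal. */
static const char *finished_message(void)
{
    return "finished?" "?/not finished??";
}
```

The trailing ?? is left as it is because a closing quote does not complete any trigraph.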
I'm trying to match some chunks of interesting data within a data stream.
There should be a leading < then four alphanumeric characters, two characters of checksum (or ?? if no checksum was specified) and a trailing >.
If the last two characters are alphanumeric, the following code works as expected. If they're ?? though, it fails.
// Set up a pre-populated data buffer as an example
std::string haystack = "Fli<data??>bble";
// Set up the regex
static const boost::regex e("<\\w{4}.{2}>");
std::string::const_iterator start, end;
start = haystack.begin();
end = haystack.end();
boost::match_flag_type flags = boost::match_default;
// Try and find something of interest in the buffer
boost::match_results<std::string::const_iterator> what;
bool succeeded = regex_search(start, end, what, e, flags); // <-- returns false
I've not spotted anything in the documentation which suggests this should be the case (all but NULL and newline should be matched AIUI).
So what have I missed?
When were the 'and' and 'or' alternative tokens introduced in C++?
I've just read this nice piece from Reddit.
They mention and and or being "Alternative Tokens" to && and ||.
I was really unaware of these until now. Of course, everybody knows about the di-graphs and tri-graphs, but and and or? Since when? Is this a recent addition to the standard?
I've just checked it with Visual C++ 2008 and it doesn't seem to recognize these as anything other than a syntax error. What's going on?
Are trigraphs still valid C++?
We all know about the historical curiosity that is digraphs and trigraphs, but with all the changes made to C++ in recent years I'm curious: are they valid C++14? How about C++17?
Why MSVC compiler converts "??-" sequence to "~" in string literals?
I have a hard-coded string in my code (which should be used as a file mask), but the compiler always changes the "??-" sequence to "~", for example:
const wchar_t textW[] = L"test-??-??-??.txt";
textW will be "test-~~??.txt" (without quotes).
The same happens for non-unicode strings as well:
const char textA[] = "test-????-??-??.txt";
textA will be "test-??~~??.txt" (without quotes).
My compiler is Microsoft Visual C++ 2008.
I have just tried this with Visual Studio 2013: the string at runtime is correct and IntelliSense displays the correct value in the tooltip when I'm tracing the app. But at edit time (when the app isn't running) IntelliSense displays the incorrect value, with tildes, in the tooltip.
Purpose of Trigraph sequences in C++?
According to C++'03 Standard 2.3/1:
Before any other processing takes place, each occurrence of one of the following sequences of three characters (“trigraph sequences”) is replaced by the single character indicated in Table 1.
----------------------------------------------------------------------------
| trigraph | replacement | trigraph | replacement | trigraph | replacement |
----------------------------------------------------------------------------
| ??=      | #           | ??(      | [           | ??<      | {           |
| ??/      | \           | ??)      | ]           | ??>      | }           |
| ??'      | ^           | ??!      | |           | ??-      | ~           |
----------------------------------------------------------------------------
In real life that means that the code printf( "What??!\n" ); will result in printing What| because ??! is a trigraph sequence that is replaced with the | character.
My question is: what is the purpose of using trigraphs? Is there any practical advantage to using them?
UPD: The answers mention that some European keyboards don't have all the punctuation characters, so do non-US programmers have to use trigraphs in everyday life?
UPD2: Visual Studio 2010 has trigraph support turned off by default.
suggest like google with postgresql trigrams and full text search
I want to do a text search like Google suggestions.
I'm using PostgreSQL because of the magical PostGIS.
I was thinking of using FTS, but I saw that it could not search partial words, so I found this question, and saw how trigrams work.
The main problem is that the search engine I'm working on is for the Spanish language. FTS worked great with stemming and dictionaries (synonyms, misspellings), UTF and so on. Trigrams worked great for partial words, but they only work for ASCII, and (obviously) they don't use things like dictionaries.
I was wondering whether there is any way in which the best of both could be used.
Is it possible to make Full Text Search and trigrams work together in PGSQL?
Are trigraphs required to write a newline character in C99 using only ISO 646?
Assume that you're writing (portable) C99 code in the invariant set of ISO 646. This means that the \ (backslash, reverse solidus, however you name it) can't be written directly. For instance, one could opt to write a Hello World program as such:
%:include <stdio.h>
%:include <stdlib.h>
int main()
<%
fputs("Hello World!??/n", stdout);
return EXIT_SUCCESS;
%>
However, besides digraphs, I used the ??/ trigraph to write the \ character.
Given my assumptions above, is it possible to either
write the '\n' character (which is translated to a newline in <stdio.h> functions) in a string without the use of trigraphs, or
write a newline to a FILE * without using the '\n' character?
I saw the following code in some legacy code:
size_t a = 1 ???- 2 :0;
What does the symbol ???- mean in C++? How should I understand it?
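Reading the characters left to right under trigraph replacement, the first ? is left alone (??? is not a trigraph) and the following ??- becomes ~, so the line is a conditional expression whose middle operand is ~2. A sketch of that reading with the trigraph spelled out, written in C (the function name legacy_value is invented for the example):

```c
#include <stddef.h>

/* Trigraph-free spelling of the expression from the question.
 * The condition 1 is always true, so the result is ~2 converted
 * to size_t. */
static size_t legacy_value(void)
{
    size_t a = 1 ? ~2 : 0;
    return a;
}
```

On a compiler with trigraphs disabled, the original line would presumably fail to parse instead.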
I was looking at the escape sequences for characters in strings in C++ and I noticed there is an escape sequence for a question mark. Can someone tell me why this is? It just seems a little odd and I can't figure out what ? does in a string. Thanks.
What is the meaning of these strange question marks?
I came across some weird-looking code. It doesn't even look like C, yet to my surprise it compiles and runs on my C compiler. Is this some non-standard extension to the C language and if so, what is the reason for it?
??=include <stdio.h>
int main()
??<
const char arr[] =
??<
0xF0 ??! 0x0F,
??-0x00,
0xAA ??' 0x55
??>;
for(int i=0; i<sizeof(arr)/sizeof(*arr); i++)
??<
printf("%X??/n", (unsigned char)arr??(i??));
??>
return 0;
??>
Output:
FF
FF
FF
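For comparison, a sketch of the same three initializers written with the ordinary operators the trigraphs stand for (| for ??!, ~ for ??-, ^ for ??'). The helper name fill_values is invented, unsigned char is used instead of the question's const char so the values can be compared directly, and an 8-bit char is assumed.

```c
/* Writes the three values from the question into out, using the
 * plain operators instead of trigraphs. Assumes 8-bit chars. */
static void fill_values(unsigned char out[3])
{
    out[0] = 0xF0 | 0x0F;            /* bitwise OR */
    out[1] = (unsigned char)~0x00;   /* bitwise NOT, truncated to one byte */
    out[2] = 0xAA ^ 0x55;            /* bitwise XOR */
}
```

All three elements come out as 0xFF, which matches the three FF lines in the output above.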
Unknown meta-character in C/C++ string literal?
I created a new project with the following code segment:
char* strange = "(Strange??)";
cout << strange << endl;
resulting in the following output:
(Strange]
Thus translating '??)' -> ']'
Debugging it shows that my char* string literal is actually that value and it's not a stream translation. This is obviously not a meta-character sequence I've ever seen. Some sort of Unicode or wide char sequence perhaps? I don't think so however... I've tried disabling all related project settings to no avail.
Anyone have an explanation?
c99 standard 5.2.1.1 Trigraph sequences
2 EXAMPLE The following source line
printf("Eh???/n");
becomes (after replacement of the trigraph sequence ??/)
printf("Eh?\n");
It says that the trigraph sequence will be replaced, but it isn't.
It's printing "Eh???/n".
Am I missing something?
how can i skip those warnings? C++
Code Added:
bool CHARACTER::SpamAllowBuf(const char *Message)
{
if (!strcmp(Message, "(?˛´c)") || !strcmp(Message, "(μ·)") || !strcmp(Message, "(±a≫Y)") || !strcmp(Message, "(AA??)") || !strcmp(Message, "(≫c¶?)") || !strcmp(Message, "(?đłe)") || !strcmp(Message, "(??C?)") || !strcmp(Message, "(????)") || !strcmp(Message, "(AE??)"))
{
return true;
}
return false;
}
Warnings given:
char.cpp:7254:121: warning: trigraph ??) ignored, use -trigraphs to enable
char.cpp:7254:245: warning: trigraph ??) ignored, use -trigraphs to enable
char.cpp:7254:275: warning: trigraph ??) ignored, use -trigraphs to enable
How can I suppress these warnings?