<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2021 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Free Software" and "Free Software Needs
Free Documentation", with the Front-Cover Texts being "A GNU Manual,"
and with the Back-Cover Texts as in (a) below.

(a) The FSF's Back-Cover Text is: "You are free to copy and modify
this GNU Manual.  Buying copies from GNU Press supports the FSF in
developing GNU and promoting software freedom." -->
<!-- Created by GNU Texinfo 5.1, http://www.gnu.org/software/texinfo/ -->
<head>
<title>Debugging with GDB: Character Sets</title>

<meta name="description" content="Debugging with GDB: Character Sets">
<meta name="keywords" content="Debugging with GDB: Character Sets">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Data.html#Data" rel="up" title="Data">
<link href="Caching-Target-Data.html#Caching-Target-Data" rel="next" title="Caching Target Data">
<link href="Core-File-Generation.html#Core-File-Generation" rel="previous" title="Core File Generation">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="Character-Sets"></a>
<div class="header">
<p>
Next: <a href="Caching-Target-Data.html#Caching-Target-Data" accesskey="n" rel="next">Caching Target Data</a>, Previous: <a href="Core-File-Generation.html#Core-File-Generation" accesskey="p" rel="previous">Core File Generation</a>, Up: <a href="Data.html#Data" accesskey="u" rel="up">Data</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Character-Sets-1"></a>
<h3 class="section">10.20 Character Sets</h3>
<a name="index-character-sets"></a>
<a name="index-charset"></a>
<a name="index-translating-between-character-sets"></a>
<a name="index-host-character-set"></a>
<a name="index-target-character-set"></a>

<p>If the program you are debugging represents characters and strings in
a different character set than the one <small>GDB</small> itself uses,
<small>GDB</small> can automatically translate between the two character sets
for you.  We call the character set <small>GDB</small> uses the <em>host
character set</em>, and the one the inferior program uses the
<em>target character set</em>.
</p>
<p>For example, if you are running <small>GDB</small> on a <small>GNU</small>/Linux system, which
uses the ISO Latin 1 character set, but you are using <small>GDB</small>&rsquo;s
remote protocol (see <a href="Remote-Debugging.html#Remote-Debugging">Remote Debugging</a>) to debug a program
running on an IBM mainframe, which uses the <small>EBCDIC</small> character set,
then the host character set is Latin 1, and the target character set is
<small>EBCDIC</small>.  If you give <small>GDB</small> the command <code>set
target-charset EBCDIC-US</code>, then <small>GDB</small> translates between
<small>EBCDIC</small> and Latin 1 as you print character or string values, or use
character and string literals in expressions.
</p>
<p><small>GDB</small> has no way to automatically recognize which character set
the inferior program uses; you must tell it, using the <code>set
target-charset</code> command, described below.
</p>
<p>Here are the commands for controlling <small>GDB</small>&rsquo;s character set
support:
</p>
<dl compact="compact">
<dt><code>set target-charset <var>charset</var></code></dt>
<dd><a name="index-set-target_002dcharset"></a>
<p>Set the current target character set to <var>charset</var>.  To display the
list of supported target character sets, type
<kbd>set&nbsp;<span class="nolinebreak">target-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>.
</p>
</dd>
<dt><code>set host-charset <var>charset</var></code></dt>
<dd><a name="index-set-host_002dcharset"></a>
<p>Set the current host character set to <var>charset</var>.
</p>
<p>By default, <small>GDB</small> uses a host character set appropriate to the
system it is running on; you can override that default using the
<code>set host-charset</code> command.  On some systems, <small>GDB</small> cannot
automatically determine the appropriate host character set.  In this
case, <small>GDB</small> uses &lsquo;<samp>UTF-8</samp>&rsquo;.
</p>
<p><small>GDB</small> can only use certain character sets as its host character
set.  If you type <kbd>set&nbsp;<span class="nolinebreak">host-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>,
<small>GDB</small> will list the host character sets it supports.
</p>
</dd>
<dt><code>set charset <var>charset</var></code></dt>
<dd><a name="index-set-charset"></a>
<p>Set the current host and target character sets to <var>charset</var>.  As
above, if you type <kbd>set&nbsp;charset&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>,
<small>GDB</small> will list the names of the character sets that can be used
for both host and target.
</p>
</dd>
<dt><code>show charset</code></dt>
<dd><a name="index-show-charset"></a>
<p>Show the names of the current host and target character sets.
</p>
</dd>
<dt><code>show host-charset</code></dt>
<dd><a name="index-show-host_002dcharset"></a>
<p>Show the name of the current host character set.
</p>
</dd>
<dt><code>show target-charset</code></dt>
<dd><a name="index-show-target_002dcharset"></a>
<p>Show the name of the current target character set.
</p>
</dd>
<dt><code>set target-wide-charset <var>charset</var></code></dt>
<dd><a name="index-set-target_002dwide_002dcharset"></a>
<p>Set the current target&rsquo;s wide character set to <var>charset</var>.  This is
the character set used by the target&rsquo;s <code>wchar_t</code> type.  To
display the list of supported wide character sets, type
<kbd>set&nbsp;<span class="nolinebreak">target-wide-charset</span>&nbsp;<span class="key">TAB</span><span class="key">TAB</span><!-- /@w --></kbd>.
</p>
</dd>
<dt><code>show target-wide-charset</code></dt>
<dd><a name="index-show-target_002dwide_002dcharset"></a>
<p>Show the name of the current target&rsquo;s wide character set.
</p></dd>
</dl>

<p>Here is an example of <small>GDB</small>&rsquo;s character set support in action.
Assume that the following source code has been placed in the file
<samp>charset-test.c</samp>:
</p>
<div class="smallexample">
<pre class="smallexample">#include &lt;stdio.h&gt;

/* "Hello, world!\n" encoded in ASCII.  */
char ascii_hello[]
  = {72, 101, 108, 108, 111, 44, 32, 119,
     111, 114, 108, 100, 33, 10, 0};
/* The same text encoded in IBM1047.  */
char ibm1047_hello[]
  = {200, 133, 147, 147, 150, 107, 64, 166,
     150, 153, 147, 132, 90, 37, 0};

int
main (void)
{
  printf (&quot;Hello, world!\n&quot;);
  return 0;
}
</pre></div>

<p>In this program, <code>ascii_hello</code> and <code>ibm1047_hello</code> are arrays
containing the string &lsquo;<samp>Hello, world!</samp>&rsquo; followed by a newline,
encoded in the <small>ASCII</small> and <small>IBM1047</small> character sets, respectively.
</p>
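<p>As a rough illustration of what translating between the two encodings
involves, the following sketch decodes the bytes of <code>ibm1047_hello</code>
into <small>ASCII</small>.  It is not how <small>GDB</small> performs the conversion: the
helper names <code>ibm1047_to_ascii</code> and <code>decode_ibm1047</code> are
hypothetical, and the table covers only the byte values that appear in the
example array.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map one IBM1047 byte to its ASCII equivalent.
   Only the bytes used by ibm1047_hello are covered; anything else
   comes back as '?'.  */
char
ibm1047_to_ascii (unsigned char byte)
{
  switch (byte)
    {
    case 200: return 'H';
    case 133: return 'e';
    case 147: return 'l';
    case 150: return 'o';
    case 107: return ',';
    case 64:  return ' ';
    case 166: return 'w';
    case 153: return 'r';
    case 132: return 'd';
    case 90:  return '!';
    case 37:  return '\n';
    default:  return '?';
    }
}

/* Hypothetical helper: decode the NUL-terminated IBM1047 string SRC
   into DST, truncating if DST_SIZE is too small.  DST is always
   NUL-terminated.  */
void
decode_ibm1047 (const unsigned char *src, char *dst, size_t dst_size)
{
  size_t len = strlen ((const char *) src);
  size_t i;

  assert (dst != NULL && dst_size > 0);
  if (len > dst_size - 1)
    len = dst_size - 1;
  for (i = 0; i < len; i++)
    dst[i] = ibm1047_to_ascii (src[i]);
  dst[len] = '\0';
}
```

<p>Calling <code>decode_ibm1047</code> on the fifteen bytes of
<code>ibm1047_hello</code> with a sufficiently large buffer produces the same
text that <code>ascii_hello</code> holds.
</p>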
<p>We compile the program, and invoke the debugger on it:
</p>
<div class="smallexample">
<pre class="smallexample">$ gcc -g charset-test.c -o charset-test
$ gdb -nw charset-test
GNU gdb 2001-12-19-cvs
Copyright 2001 Free Software Foundation, Inc.
&hellip;
(gdb)
</pre></div>

<p>We can use the <code>show charset</code> command to see what character sets
<small>GDB</small> is currently using to interpret and display characters and
strings:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) show charset
The current host and target character set is `ISO-8859-1'.
(gdb)
</pre></div>

<p>For the sake of printing this manual, let&rsquo;s use <small>ASCII</small> as our
initial character set:
</p><div class="smallexample">
<pre class="smallexample">(gdb) set charset ASCII
(gdb) show charset
The current host and target character set is `ASCII'.
(gdb)
</pre></div>

<p>Let&rsquo;s assume that <small>ASCII</small> is indeed the correct character set for our
host system; in other words, let&rsquo;s assume that if <small>GDB</small> prints
characters using the <small>ASCII</small> character set, our terminal will display
them properly.  Since our current target character set is also
<small>ASCII</small>, the contents of <code>ascii_hello</code> print legibly:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print ascii_hello
$1 = 0x401698 &quot;Hello, world!\n&quot;
(gdb) print ascii_hello[0]
$2 = 72 'H'
(gdb)
</pre></div>

<p><small>GDB</small> uses the target character set for character and string
literals you use in expressions:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print '+'
$3 = 43 '+'
(gdb)
</pre></div>

<p>The <small>ASCII</small> character set uses the number 43 to encode the &lsquo;<samp>+</samp>&rsquo;
character.
</p>
<p><small>GDB</small> relies on the user to tell it which character set the
target program uses.  If we print <code>ibm1047_hello</code> while our target
character set is still <small>ASCII</small>, we get gibberish:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print ibm1047_hello
$4 = 0x4016a8 &quot;\310\205\223\223\226k@\246\226\231\223\204Z%&quot;
(gdb) print ibm1047_hello[0]
$5 = 200 '\310'
(gdb)
</pre></div>
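
<p>The backslash sequences in this output are the octal values of the bytes
that have no printable <small>ASCII</small> meaning; the remaining bytes happen to
be valid <small>ASCII</small> and appear as ordinary characters.  The sketch below
mimics that rendering for the example array.  It is a simplification rather
than <small>GDB</small>&rsquo;s actual printing code (for instance, it does not produce
named escapes such as <code>\n</code>, and it does not escape quotes or
backslashes), and <code>render_as_ascii</code> is a hypothetical name.
</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: render the NUL-terminated byte string SRC into
   DST as if every byte were ASCII.  Printable bytes (0x20..0x7e) are
   copied unchanged; every other byte becomes a backslash followed by
   three octal digits.  Output stops early if DST_SIZE is too small.  */
void
render_as_ascii (const unsigned char *src, char *dst, size_t dst_size)
{
  size_t used = 0;

  assert (dst != NULL && dst_size > 0);
  dst[0] = '\0';
  for (; *src != 0; src++)
    {
      char piece[8];  /* Room for "\ooo" plus the terminating NUL.  */
      size_t len;

      if (*src >= 0x20 && *src <= 0x7e)
        snprintf (piece, sizeof piece, "%c", *src);
      else
        snprintf (piece, sizeof piece, "\\%03o", (unsigned int) *src);

      len = strlen (piece);
      if (used + len >= dst_size)
        break;  /* Not enough room left for this piece and the NUL.  */
      memcpy (dst + used, piece, len + 1);
      used += len;
    }
}
```

<p>Applied to the bytes of <code>ibm1047_hello</code>, this reproduces the string
shown for <code>$4</code> above: byte 200 becomes <code>\310</code>, while byte
107 is the <small>ASCII</small> code for &lsquo;<samp>k</samp>&rsquo; and is left alone.
</p>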

<p>If we type the <code>set target-charset</code> command followed by <tt class="key">TAB</tt><tt class="key">TAB</tt>,
<small>GDB</small> lists the character sets it supports:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) set target-charset
ASCII       EBCDIC-US   IBM1047     ISO-8859-1
(gdb) set target-charset
</pre></div>

<p>We can select <small>IBM1047</small> as our target character set, and examine the
program&rsquo;s strings again.  Now the <small>ASCII</small> string prints incorrectly, but
<small>GDB</small> translates the contents of <code>ibm1047_hello</code> from the
target character set, <small>IBM1047</small>, to the host character set,
<small>ASCII</small>, and they display correctly:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) set target-charset IBM1047
(gdb) show charset
The current host character set is `ASCII'.
The current target character set is `IBM1047'.
(gdb) print ascii_hello
$6 = 0x401698 &quot;\110\145%%?\054\040\167?\162%\144\041\012&quot;
(gdb) print ascii_hello[0]
$7 = 72 '\110'
(gdb) print ibm1047_hello
$8 = 0x4016a8 &quot;Hello, world!\n&quot;
(gdb) print ibm1047_hello[0]
$9 = 200 'H'
(gdb)
</pre></div>

<p>As above, <small>GDB</small> uses the target character set for character and
string literals you use in expressions:
</p>
<div class="smallexample">
<pre class="smallexample">(gdb) print '+'
$10 = 78 '+'
(gdb)
</pre></div>

<p>The <small>IBM1047</small> character set uses the number 78 to encode the &lsquo;<samp>+</samp>&rsquo;
character.
</p>
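<p>The two results for <code>print '+'</code>, 43 and 78, are simply the codes
that each character set assigns to the same character.  A minimal sketch of
that lookup follows.  The <code>enum charset</code> type and the
<code>encode_char</code> function are hypothetical, the <small>IBM1047</small> table
is deliberately partial, and the <small>ASCII</small> branch assumes the compiler&rsquo;s
own execution character set is <small>ASCII</small>-compatible.
</p>

```c
#include <assert.h>

/* Hypothetical names for the two character sets used in this example.  */
enum charset { CHARSET_ASCII, CHARSET_IBM1047 };

/* Hypothetical helper: return the numeric code that character set CS
   assigns to the character C, or -1 if this partial table does not
   cover it.  Assumes the compiler's execution character set is
   ASCII-compatible, so that '+' itself has the value 43.  */
int
encode_char (enum charset cs, char c)
{
  assert (cs == CHARSET_ASCII || cs == CHARSET_IBM1047);

  if (cs == CHARSET_ASCII)
    return ((unsigned char) c <= 127) ? (int) (unsigned char) c : -1;

  /* IBM1047: only a few characters from the example are listed.  */
  switch (c)
    {
    case '+': return 78;
    case 'H': return 200;
    case ' ': return 64;
    default:  return -1;
    }
}
```

<p>With these definitions, <code>encode_char (CHARSET_ASCII, '+')</code> yields
43 and <code>encode_char (CHARSET_IBM1047, '+')</code> yields 78, matching the
values printed as <code>$3</code> and <code>$10</code>.
</p>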

</body>
</html>