Overview of Character Data Type
Characters in computers are represented using bits, typically 8 bits (1 byte) per character. The primary encoding scheme discussed is ASCII, which uses 7 bits to represent 128 characters (0-127). Extended ASCII utilizes all 8 bits to represent 256 characters (0-255), including additional symbols and characters needed for non-English languages.
Character Declaration and Usage
- Characters are stored in variables declared with the char data type.
- The value must be enclosed in single quotes (e.g., 'A').
- Only one character can be stored per variable due to the 1-byte size.
- Integer values can also be assigned to char variables; when printed with the %c format specifier, the integer is interpreted as its ASCII character equivalent (e.g., 65 corresponds to 'A').
See also Understanding Data Representation in C Programming for more on how data is represented internally.
Size and Range of Character Variables
- Size: 1 byte (8 bits)
- Unsigned char range: 0 to 255
- Signed char range: -128 to +127 (using 2's complement representation)
- ASCII traditionally uses 7 bits; Extended ASCII makes use of the full 8 bits.
For an in-depth explanation of integer size and range concepts that closely relate to characters, refer to Understanding Integer Data Type: Size, Range, and Number Systems Explained.
Signed vs Unsigned Characters Explained
- Signed characters use one bit as a sign bit, allowing negative values, which correspond to values in the extended ASCII range.
- Certain signed negative values share a bit pattern with unsigned positive values (e.g., signed -128 and unsigned 128 are both stored as 10000000).
- Negative values in characters do not provide extra functionality but reflect binary representation constraints.
- The most significant bit's place value is negative in signed representation.
Further insights on signed and unsigned types and the overflow issues can be explored in Understanding Integer Range Overflow in Signed and Unsigned Types.
Two's Complement Representation
- Negative values are represented in two's complement form.
- Examples:
- -128 is represented by setting the most significant bit (MSB) to 1 and all others to 0.
- Binary representations of signed negative values correspond to specific unsigned positive integers.
Practical Code Insights
- Using %c with values assigned to char variables prints the corresponding character.
- Signed and unsigned chars can print the same characters for different integer values due to binary equivalence.
- Understanding this helps avoid confusion with character and integer representations.
Summary
- Character size is fixed at 1 byte.
- ASCII uses 7 bits, Extended ASCII uses 8 bits.
- Signed char ranges from -128 to 127, unsigned char from 0 to 255.
- Negative character values correspond to positive values in binary representation; they don't add extra power.
- Proper use of single quotes and format specifiers is essential when working with character variables.
This foundational knowledge enables programmers to manage character data accurately and understand underlying encoding mechanisms in software development.
Today we will start our discussion on the second fundamental data type, called character.
Here is the outline of this lecture. Today, we will study a brief overview of the character data type, the size of characters, and the range of characters. And we will also talk about the difference between signed and unsigned characters. Let's have a brief overview.
If you remember, in the first lesson I told you how we can represent characters in a computer. Recall the example of HELLO! I also told you how each character is represented with 8 bits of information. A computer is capable of understanding only 0 and 1. Therefore, we need to represent characters in the form of 0s and 1s only. But we don't need to bother about it, because internally everything is represented in the form of bits. To encode characters, there are several encoding schemes available, but one of the most common is the ASCII encoding scheme.
This is an ASCII table that represents the ASCII encoding scheme. Here you can see there are some characters which are non-printable and some which are printable. The non-printable ones are the control characters, and the printable ones are the characters you can print on the screen. ASCII uses 7 bits to encode characters; therefore, we have 128 characters in total. As you can see here, this is from 0 to 127. That is, there are 128 characters in total in the ASCII table. But at the very minimum we have 1 byte, and we know 1 byte is equal to 8 bits. ASCII requires just 7 bits to represent characters; therefore, the most significant bit, that is, the eighth bit, is set to 0.
Let's see how we define and declare a character variable. Here you can see I have declared a variable of character data type and assigned it a character. A variable can have any name of your choice, but if it is of character data type, it is capable of holding one character at a time. Note these single quotes over here. Now this is important: remember to put single quotes and not double quotes. If you use double quotes, you might get some unexpected results. A character variable is able to hold only one character at a time. This is very important. If you want to provide a whole string to it, it won't be able to hold it, because its size is equal to 1 byte. It won't be able to hold more than 1 character at a time.
It is also not necessary to provide only characters to these variables. You can also assign integer values to them. For example, in this variable I have provided the value 65. Now, this value acts like a character when we print it. When we try to print the contents of this variable, we get a character instead of an integer. That depends entirely on the format specifier you are using. Here is the code: Here we can see I have provided %c as the format specifier. If you put %d instead of %c, it will print the decimal value. But in this case it will print a character. Let's see what the associated character is for this particular decimal value. As we know, the associated character for the decimal value 65 is A. That is why A is printed. If you look at the ASCII table, you can see that A is associated with the decimal value 65.
Providing 65 is the same as providing the character 'A'. Now let us understand why this happens. After all, everything is in the form of bits only. Therefore, whether you write the character 'A' or the value 65, both are one and the same thing, because their binary representations are the same.
The only difference between a character and an integer is that a character is capable of holding only 1 byte of information; on the other hand, an integer is capable of holding either 2 bytes or 4 bytes of information. Both can store an integer. Both can print integers. But it is better to use each for what it is meant for.
Let's see the size and range of a character. The size of a char variable, or a character, is 1 byte. The range is from 0 to 255 in the case of an unsigned character, and -128 to +127 in the case of a signed character. The signed range comes from the 2's complement representation. The unsigned range is because we have 8 bits of information available to us; therefore, the maximum value that we are able to represent is 255. In the traditional ASCII character encoding, we have only 7 bits to encode the characters, and at the very minimum we have to have 8 bits. Therefore, the 8th bit is wasted. There is one more encoding scheme, called the Extended ASCII encoding scheme, that utilizes the 8th bit, or you can say, the MSB. Therefore, the range is utilized properly in this encoding scheme. As you can see over here, the range is from 0 to 255 instead of 0 to 127. Note: apart from the English characters, for non-English speakers we have to represent the characters of other languages as well, like Russian, German, Chinese, etc.
For them, other schemes are available. But our concern is the traditional ASCII character encoding scheme, which covers most of the special symbols as well as the English characters and digits that we use in our day-to-day life. Most of the time, that is sufficient. Therefore, we won't have to bother much about the other schemes. Let's move to the next topic: the difference between signed and unsigned characters. I told you the signed and unsigned ranges for characters.
But it is not an easy fact to digest that we have both signed and unsigned ranges of characters. The unsigned range is OK, but why a signed range? In the case of integers, a signed range makes sense, because in reality we are representing not only unsigned integers but signed integers as well. But what are negative values doing in characters? Are they buying us some additional powers? Even though we won't require negative values at all, as we know, internally everything is in the form of bits, so nothing stops us from providing negative values to character variables. But the question is, what happens when we provide negative values to them? To understand this concept, let's consider the Extended ASCII table once again. 0 to 127 is the same for both the signed and unsigned ranges. The difference comes in -128 to -1 in the signed range and 128 to 255 in the unsigned range. Let's write down the 2's complement representation of -128 in binary. There is one important point to note. Here we can see the place value is -(2 raised to the power 7), that is, -128. And this is not the usual case when we are representing a positive value.
Right? This is -(2 raised to the power 7). If we want to represent negative 128, we have to set this bit to 1 and reset all the other bits, because -(2 raised to the power 7) is -128. Therefore, by setting this particular bit we are able to represent -128. This is the 2's complement representation. Always remember that in the signed representation the most significant bit's place value is always negative. In the case of unsigned numbers, this is quite easy, because here the place value is positive: if we set this bit to 1 and reset all the other bits, we are able to represent +128.
Let's try to represent -127. By setting this bit (the most significant bit) to 1 and this bit (the least significant bit) to 1, we are able to represent -127, since -128 + 1 is -127. On the other hand, if I want to represent the value +129, this is also very easy: 2 raised to the power 7, which is equal to 128, and 2 raised to the power 0, which is equal to 1. Adding these two values together, we get our answer, +129. As you can observe, these two bit patterns are equal, just as in the previous case. -128 and +128 have the same binary representation. Similarly, -127 and +129 have the same binary representation.
If I would like to represent -126, this would be the binary representation, and for +130 this would be the binary representation. Both are equal. Now consider -1. This is -128, and if I add the rest of the place values together, it is +127; -128 + 127 is equal to -1. Therefore, we need to set all the bits to 1. If you want to represent +255, we likewise have to set all these bits to 1. The only difference, as you can see, is this place value. That is why we get two different values for the same binary representation.
OK, let's implement the code to understand. If we try to print -1 and if we try to print +255, both are one and the same thing. Let's implement the code. Let's see what character is printed for this particular value. This would be the character. If you look at the Extended ASCII table, the associated character for this particular value would be this. Let's change the code a little bit. -128 and +128 are one and the same thing. Therefore, they must print the same character. Let's see whether they do or not. Yes, they print the same character. Therefore, it is verified that +128 and -128 are the same.
Let's see what happens when we change this to +129 and execute it. This would be the character. And let's see whether -127 and 129 are the same or not. Yes, they are the same. Therefore, it is verified that everything we have studied up to now is correct. So the final conclusion is that negative values won't buy you any additional power in the case of character variables. Always remember that each negative value is equivalent to some positive value in the Extended ASCII character set, because after all everything is binary only. And one more thing: the idea of range-exceeding conditions for characters is similar to that for integers, which we studied in the previous lecture. Therefore, it is not worth mentioning each and every point once again. If you want, you can refer to the previous lecture and relate the concepts accordingly. Let's have a summary of whatever we have studied up till now. The size of a character is equal to 1 byte. The signed character range is from -128 to +127. The unsigned character range is from 0 to 255. Negative values won't buy you any additional powers. In the traditional ASCII table, each character requires 7 bits. In the Extended ASCII table, each character utilizes all 8 bits. OK friends, this is it for now. See you in the next lecture.
ASCII uses 7 bits to represent 128 characters primarily for English letters, digits, and control symbols, with values from 0 to 127. Extended ASCII utilizes all 8 bits (1 byte) to represent 256 characters, adding symbols and characters for non-English languages.
Use the char data type and enclose a single character in single quotes, like char ch = 'A';. You can also assign an integer value corresponding to an ASCII code (e.g., char ch = 65;), which will represent the character equivalent when printed with the %c format specifier.
A char variable occupies 1 byte (8 bits). An unsigned char ranges from 0 to 255, representing only non-negative values. A signed char ranges from -128 to +127, using the most significant bit as a sign bit with two's complement representation.
Signed chars use the most significant bit as a sign bit, allowing values from -128 to 127. Negative values in signed chars correspond to unsigned positive values through binary equivalence (e.g., signed -128 equals unsigned 128). These negative values reflect the binary encoding, not additional character functionality.
Two's complement represents negative numbers by inverting all the bits of the positive value and adding one. For example, -128 sets the most significant bit to 1 and all others to 0. This binary value matches the unsigned character 128, enabling signed and unsigned chars to share the same bit patterns but differ in interpretation.
Assign the integer value corresponding to the character's ASCII code to a char variable and print it using the %c format specifier in functions like printf. For example, char ch = 65; printf("%c", ch); prints 'A'. For ASCII codes from 0 to 127, the printed character is the same whether the char is signed or unsigned.
Always enclose single characters in single quotes when assigning to char variables. Understand the signed vs unsigned range to avoid unexpected negative values when interpreting the data. Use correct format specifiers (%c for characters) when printing to avoid confusion between integer and character outputs.
Heads up!
This summary and transcript were automatically generated using AI with the Free YouTube Transcript Summary Tool by LunaNotes.
Related Summaries
Understanding Integer Data Type: Size, Range, and Number Systems Explained
This summary explores the integer data type, its memory allocation, and how computers represent integer ranges using decimal and binary number systems. It also covers calculating integer range for different byte sizes, including the use of two's complement for signed integers.
Understanding Data Representation in C Programming
Explore how data representation works in computers, focusing on integers and binary systems in C programming.
Comprehensive Guide to Integer Data Types and Modifiers in C Programming
This article explores integer data type modifiers in C, including short, long, signed, and unsigned. Learn about memory size differences, value ranges, and how to use symbolic constants and printf specifiers to work effectively with these data types.
Understanding Advanced printf Usage and Integer Behaviors in C Programming
This comprehensive summary explores key concepts in C programming, including nested printf functions, string width specifiers, character variable overflow, integer declarations, and nuances of signed versus unsigned integer arithmetic. Learn how printf returns values, how formatting affects output, and how integer operations behave in different contexts.
Understanding Variable Data Types and Operators in C++
Learn about variable data types and operators in C++. Discover syntax, examples, and functions for programming in C++.

