JDasm

Multi purpose disassembler, format decompiler, and hex editor.


Reading, and editing binary data.

When you open any file or disk drive, you will see the following output.



This view is called a hex editor. It lets you read the raw binary data of a file, or of an entire disk, exactly as it is stored, and even lets you change that data.


To use a hex editor properly, you need to understand how the information is displayed and what system memory is made out of.


Each single 0 to 9 and A to F character is four binary digits.


Hex     0    1    2    3    4    5    6    7    8    9    A    B    C    D    E    F
Binary  0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 1010 1011 1100 1101 1110 1111


Hex is used to make the binary view more compact and to keep it readable. It is meant to be a short easy to read representation of binary.


For example, 19FE hex maps into 0001 1001 1111 1110 binary. Also 00611 is 0000 0000 0110 0001 0001.
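
The mapping above can be sketched in a few lines of Python. This is only an illustration; the function names hex_to_binary and binary_to_hex are my own and are not part of JDasm.

```python
def hex_to_binary(hex_text: str) -> str:
    """Expand each hex digit into its four binary digits."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_text)


def binary_to_hex(binary_text: str) -> str:
    """Collapse space-separated groups of four binary digits into hex digits."""
    return "".join(format(int(group, 2), "X") for group in binary_text.split())


print(hex_to_binary("19FE"))   # 0001 1001 1111 1110
print(hex_to_binary("00611"))  # 0000 0000 0110 0001 0001
print(binary_to_hex("0001 1001 1111 1110"))  # 19FE
```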


Double-clicking any hex digit will let you type a 0 to 9 and A to F value. You can use the arrow keys to navigate the binary. Hitting enter or ESC will exit edit mode.


Note that the web version lets you analyze binary formats and files as they are, but does not let you edit the files. Only the Java application allows you to edit binary files.


There is also a square around every two hex digits because every eight binary digits are one position in system memory.


Each hex digit is four binary digits, so every two digits are eight binary digits.


Eight binary digits are called a byte. All memory devices operate in bytes: CD-ROMs, Blu-rays, DVDs, solid-state drives, RAM, and floppy disks. The same goes for video game media: Game Boy Color cartridges, Nintendo DS cartridges, Switch cartridges, and game discs (PlayStation, Wii).


It is a memory standard, and is something you should understand if you want to be a good software/game developer, or software engineer.


Memory byte position.


The highlighted square above is position 9 in the binary file.



The highlighted square above is position 15 in the binary file.



The highlighted square above is position 16 in the binary file. And so on. Also, no memory device can hold a piece of data smaller than a byte (eight binary digits), as the byte is the memory standard.


Binary files are measured in the number of bytes in the binary file. The size of a memory device is measured by how many bytes it can hold.

System Memory.

A standard unit of memory is 8 binary digits, which is called a byte.


Position zero is the first byte, position one is the second byte, and so on from the start of the memory device to the end of the memory device.


Every metric multiple of 1000 bytes forms a unit for measuring how big your disk drive, floppy disk, RAM, or any other digital memory is.


Also, 1000 is 1 kilo in metric, and 1000 kilo is 1 mega in metric. The word “byte” is added to the metric prefix to form the number of bytes in metric.


Metric Prefix  Symbol  Multiplier (Traditional Notation)    Exponential  Description
Yotta          Y       1,000,000,000,000,000,000,000,000    10^24        Septillion
Zetta          Z       1,000,000,000,000,000,000,000        10^21        Sextillion
Exa            E       1,000,000,000,000,000,000            10^18        Quintillion
Peta           P       1,000,000,000,000,000                10^15        Quadrillion
Tera           T       1,000,000,000,000                    10^12        Trillion
Giga           G       1,000,000,000                        10^9         Billion
Mega           M       1,000,000                            10^6         Million
kilo           k       1,000                                10^3         Thousand
hecto          h       100                                  10^2         Hundred
deca           da      10                                   10^1         Ten
base           b       1                                    10^0         One
deci           d       1/10                                 10^-1        Tenth
centi          c       1/100                                10^-2        Hundredth
milli          m       1/1,000                              10^-3        Thousandth
micro          µ       1/1,000,000                          10^-6        Millionth
nano           n       1/1,000,000,000                      10^-9        Billionth
pico           p       1/1,000,000,000,000                  10^-12       Trillionth
femto          f       1/1,000,000,000,000,000              10^-15       Quadrillionth
atto           a       1/1,000,000,000,000,000,000          10^-18       Quintillionth
zepto          z       1/1,000,000,000,000,000,000,000      10^-21       Sextillionth
yocto          y       1/1,000,000,000,000,000,000,000,000  10^-24       Septillionth


1 k is one kilo, meaning 1000 of something. Thus 1 kB means 1000 bytes in metric using prefix notation.
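
As a rough sketch, a byte count can be turned into a metric size by dividing by 1000 until the value fits under the next prefix. The helper name metric_size and the prefix list are my own choices for illustration.

```python
# Metric prefixes from the table above, each one 1000 times the previous.
PREFIXES = ["", "k", "M", "G", "T", "P", "E", "Z", "Y"]


def metric_size(byte_count: int) -> str:
    """Express a number of bytes using the largest fitting metric prefix."""
    value = float(byte_count)
    index = 0
    while value >= 1000 and index < len(PREFIXES) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {PREFIXES[index]}B"


print(metric_size(512))        # 512 B
print(metric_size(1000))       # 1 kB
print(metric_size(1_500_000))  # 1.5 MB
```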


The same applies to ohms for resistors in electronics, as 1 kilo is 1000 as a number. The ohm is the thing being counted, just as the byte is the singular unit.


So a 1 kilo-ohm resistor is actually a 1000 ohm resistor. The same is true with wattage, from kilowatt to gigawatt, in which giga is the measure and the watt is the thing.


Every position of memory is in bytes, no matter what you are using to store the bytes. It all works the same (even ram memory).


However, terms like a petabyte of memory are rarely used. You will, however, hear sizes like this when talking supercomputers.


When it comes to video game cartridges and ROMs, we can dump all the byte data into a file on a computer. Emulators are designed to read the ROMs and play them.


Data types.

The set of data types is small and is the same across all systems (Game Boy Color to PC). Even different system architectures use the same data types. They are your building blocks for creating new picture formats, creating disk drive formats such as FAT32 or NTFS, or simply writing software for a platform or game console.


The processor must be able to do arithmetic operations with the standard primitive data types to work with file formats, or even an LCD display in a handheld.


Clicking a data type limits the hex editor to edit the bytes that make the data type. Double-clicking a data type will let you enter a value manually.


Data Length.

Processors are designed to read bytes of data, which is the standard unit of memory in all systems, and consoles. The read byte then can be used with various binary operations.


Originally two bytes created a “word”, then two words created a “double word” shortened to “DWORD”. Also, two “DWORD” created a “quadword” called “QWORD”.


Data types are generally in sizes 8-bit, 16-bit, 32-bit, 64-bit as they are doubled from each prior word size in byte format.


These are the original names given to bytes and their “WORD” lengths. We can read 2 bytes that form a WORD, just as letters make a word in English, and treat them as a 16-bit number.


Today these names are rarely used, except when translating the machine code of a video game ROM or other binary, where the original meaning still describes how the CPU reads data.


Today, when we read these “WORD” grouping sizes and wish to do arithmetic with different lengths of data, we specify types like “byte”, “short” (WORD), “integer” (DWORD), and “long” (QWORD).


The original names stay intact in disassembly of basic CPU operations, like add, multiply, or divide, when reading different word sizes of data.


It also does not matter if you have an ARM core, x86 core, or embedded core. The primitive types are the same even if the processor runs entirely different binary-encoded instructions.


We will cover processor instruction sets in another document. For now, we will talk about the data types, which are the same no matter which platform you design software on.


A byte is still a byte of memory no matter what the system CPU architecture is. The lengths of data are also still in “word size”. Also, integers are still the same.


The processor must be able to do arithmetic and other operations with the standard primitive data types in order to work with file formats. The standard data types are the building blocks for all format types and external hardware.


The only thing that can change between different system processors and architectures is the byte order.

Little endian, and big endian.

Let’s say we read a DWORD (four bytes) from memory, and the bytes are 11, 22, 33, 44 in hex.


In a processor that reads bytes in little-endian byte order, the first byte is the least significant, so the bytes are read in reverse order = 44, 33, 22, 11, giving the value 44332211 hex.


In a processor that reads bytes in big-endian byte order, the bytes are read in order = 11, 22, 33, 44, giving the value 11223344 hex.


Big-endian processors are uncommon today. When a file format uses the opposite byte order from the processor, the software must switch the byte order using arithmetic operations to maintain compatibility.


Little-endian is the byte order used by most processors today, so a big-endian system must flip the byte order when reading a value from a little-endian format and flip it again when saving changes to the file.
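
The two byte orders are easy to try out in Python, which lets you pick the order when turning bytes into a number. This is only a sketch of the idea using the four example bytes from above.

```python
import struct

data = bytes([0x11, 0x22, 0x33, 0x44])  # the DWORD as it sits in memory

# The same four bytes give two different values depending on byte order.
little = int.from_bytes(data, byteorder="little")
big = int.from_bytes(data, byteorder="big")

print(f"{little:08X}")  # 44332211
print(f"{big:08X}")     # 11223344

# The struct module does the same job: "<" is little-endian, ">" is
# big-endian, and "I" is an unsigned 32-bit integer (a DWORD).
(little_struct,) = struct.unpack("<I", data)
(big_struct,) = struct.unpack(">I", data)
```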

Integer numbers.

When we talk about integers, we are talking about numbers without a decimal point. They operate the same as the regular whole numbers you already count with.


We can count to 9 before adding one to the next place value, so 9+1=10. The number of times we have counted to 10 is in the tens position, and the number of times we have counted to 100 is in the hundreds position, as ten tens make 100.


The max value for 7 digits with 10 symbols per place value is 10000000-1=9999999. We could just as easily say 10 to the power of 7 minus 1, which is 10^7-1=9999999.


In binary, we limit ourselves to 1 and 0. We go 1+1=10, where the second place value is the number of twos we have counted instead of tens.


So if we count to two a second time, we form place value 100, which is 4. Each place value can only be counted to twice, making each next position a multiple of two instead of ten in counting.


This means a number 8 binary digits in length has 2^8=256 combinations and a max value of 2^8-1=255. This is called a byte, and is what one position of memory holds. A WORD, which is two bytes, has a max value of 2^16-1=65535.
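
The same rule gives the max value for every word size. A small Python sketch, where max_unsigned is a name I made up for the example:

```python
def max_unsigned(bits: int) -> int:
    """Largest value that fits in the given number of binary digits."""
    return 2 ** bits - 1


for name, bits in [("byte", 8), ("WORD", 16), ("DWORD", 32), ("QWORD", 64)]:
    print(name, bits, max_unsigned(bits))
```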


Using two symbols or ten symbols does not change how a number counts to the next place value, or how it is added and carries per column. The number of symbols we use before moving to the next place value is called a number base.


You can make any number system you like using any number of symbols; see Base conversion method.


Using two symbols, as off and on for a transistor, simply makes it easy to implement in a digital system. See binary, and radix.

Negative, and positive numbers.

Binary  Unsigned Decimal  Signed Decimal
0000    0                 0
0001    1                 1
0010    2                 2
0011    3                 3
0100    4                 4
0101    5                 5
0110    6                 6
0111    7                 7
1000    8                 -8
1001    9                 -7
1010    10                -6
1011    11                -5
1100    12                -4
1101    13                -3
1110    14                -2
1111    15                -1


Signed and unsigned numbers are added the same way. The only difference is how we display the value.


Adding 3 + 8 = 11


If we replace each number with its signed value, we end up with the correct result as a signed number.


The signed value for 11 is -5, the signed value for 8 is -8, and the signed value for 3 is 3.


So the sum changes into 3+-8=-5


How it works.


The numbers are split into two halves. In the negative half, the numbers get closer to zero the further you go down the table using a regular add. This way, we do not need to design a unique add circuit for negative and positive numbers.


Say we add 1111 = -1, and 0111 = 7. We add 15+7=22, and 22 in binary is 10110. The add operation is four digits in size, so only the lowest four digits, 0110, are kept, and 0110 in the table is 6 signed decimal.


So -1 + 7 = 6. This is because the carry is completely disregarded, as it is outside of the number, allowing us to go from negative to positive using a regular add.


Also, since the highest (leftmost) binary digit can only be counted to twice, it became the split point for the negative and positive halves. This digit is called the sign bit.


Because of how numbers add, the sign bit is set to one for negative numbers, mainly because when we add past 1111 = -1, the carry is removed and we land on the positive half.


So when the sign bit is set to one, we are in the negative half, which ends at 1111 = -1, and when it is set to zero, we are in the positive half, 0000 = 0 and up.
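
Here is a small Python sketch of the four-digit add from the table. The names add_4bit and as_signed_4bit are my own; the point is that one add serves both number types, and only the display step differs.

```python
def add_4bit(a: int, b: int) -> int:
    """Regular add, keeping only the lowest four binary digits (carry dropped)."""
    return (a + b) & 0b1111


def as_signed_4bit(value: int) -> int:
    """Display a four-digit value as signed: sign bit set means value - 16."""
    return value - 16 if value & 0b1000 else value


result = add_4bit(0b1111, 0b0111)  # 15 + 7 = 22 = 10110, carry dropped
print(format(result, "04b"), as_signed_4bit(result))  # 0110 6

result = add_4bit(3, 8)            # 3 + 8 = 11 unsigned
print(result, as_signed_4bit(result))                 # 11 -5
```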


See swarthmore.edu Binary arithmetic.


The CPU does not care whether a value is signed or unsigned. A separate signed add or subtract is not needed, as the CPU does the same add/subtract operation for both at hardware level.


It is how we display the value that changes.


In reality, your source code can have signed numbers, but after disassembling its add and subtract machine operations, you could recreate the code with all unsigned numbers without error.


You will actually see this when translating machine code. You have to determine the number type based on how the value is used.

Floating point numbers.

Now, if we talk about a float, we are talking about a DWORD number with a positional (floating) decimal point.


You lose some binary combinations for the number itself because the exponent, which gives the position of the point, takes up space.


The decimal point can be placed anywhere in your integer number, allowing fractional arithmetic.


This is also a primitive data type that is the same across all processor cores and mobile devices.


A double-precision number gets its name from being twice the size of a float, a QWORD, giving you a larger integer part and a bigger exponent section.


These numbers are added the same as regular numbers, with the integer part adjusted by the exponent position. It is a standard IEEE type (IEEE 754).


The implementation of said binary number format is the same on mobile as it is on PC, and game consoles as well.


The exponent is eight binary digits, the size of one byte, in a float. So you have up to 255 positions for the point, for really big or really small number values.


It is split in half for negative and positive exponents. The integer part is 23 binary digits, with a max value of 2^23-1.


Many call the integer part a fraction part. However, it does not reflect how a float works.
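
To look at the sections of a float yourself, you can pack a value into its four bytes and split the bits. This Python sketch assumes the standard IEEE 754 single-precision layout (1 sign bit, 8 exponent bits stored with 127 added, 23 bits for the integer part); float_fields is a name I made up.

```python
import struct


def float_fields(value: float):
    """Split a 32-bit float into its sign, stored exponent, and 23-bit part."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # highest bit
    exponent = (bits >> 23) & 0xFF   # next 8 bits, stored with 127 added
    mantissa = bits & 0x7FFFFF       # lowest 23 bits
    return sign, exponent, mantissa


print(float_fields(1.0))   # (0, 127, 0)
print(float_fields(0.5))   # (0, 126, 0)
print(float_fields(-2.0))  # (1, 128, 0)
```

Subtracting 127 from the stored exponent gives the position of the point: 0 for 1.0, -1 for 0.5, and 1 for 2.0.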


When the CPU has no native float add, float numbers are added with the regular ADD that is used for integers. I will demonstrate how we add float numbers.


Take a float number with a value of 0.1000000000000000000000000 in binary.


Adding it to itself is the same as adding 1+1=10 in binary.


Adding in the decimal point, it becomes 0.1+0.1=1.0 in binary.


The two numbers are lined up relative to their exponents using a shift, then added using a regular CPU ADD. A shift moves a value to the left or right a number of times. For example, 11 shifted to the left 3 times is 11000, and shifted back to the right 3 times is 11.


This is how floating-point arithmetic is done if the CPU does not have a float add operation.


Adding in the decimal point means you can have values that are a division by 2 rather than a multiple of 2.


Binary 0.1 is the same as one divided by two = 0.5 in decimal, so adding 0.5 twice is the same as adding 0.1+0.1=1.0 in binary.
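
The line-up-and-add step can be sketched with plain integers in Python. This is a simplified toy, not a full IEEE float add: each number is an integer part and an exponent (value = integer * 2^exponent), and add_scaled is a hypothetical name.

```python
def add_scaled(int1: int, exp1: int, int2: int, exp2: int):
    """Add int1 * 2**exp1 and int2 * 2**exp2 using shifts and an integer add."""
    exponent = min(exp1, exp2)
    # Shift each integer part left so both share the smaller exponent.
    total = (int1 << (exp1 - exponent)) + (int2 << (exp2 - exponent))
    return total, exponent


# Binary 0.1 + 0.1: both are 1 * 2**-1, so the integer add is 1 + 1 = 10.
total, exponent = add_scaled(1, -1, 1, -1)
print(bin(total), exponent, total * 2.0 ** exponent)  # 0b10 -1 1.0

# Binary 11 shifted left 3 times is 11000.
print(bin(0b11 << 3))  # 0b11000
```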


The following link goes into more depth, if you like.

Text data.

Binary    Hex  Char   Binary    Hex  Char   Binary    Hex  Char   Binary    Hex  Char
00000000  00   NUL    00100000  20   SP     01000000  40   @      01100000  60   `
00000001  01   SOH    00100001  21   !      01000001  41   A      01100001  61   a
00000010  02   STX    00100010  22   "      01000010  42   B      01100010  62   b
00000011  03   ETX    00100011  23   #      01000011  43   C      01100011  63   c
00000100  04   EOT    00100100  24   $      01000100  44   D      01100100  64   d
00000101  05   ENQ    00100101  25   %      01000101  45   E      01100101  65   e
00000110  06   ACK    00100110  26   &      01000110  46   F      01100110  66   f
00000111  07   BEL    00100111  27   '      01000111  47   G      01100111  67   g
00001000  08   BS     00101000  28   (      01001000  48   H      01101000  68   h
00001001  09   HT     00101001  29   )      01001001  49   I      01101001  69   i
00001010  0A   LF     00101010  2A   *      01001010  4A   J      01101010  6A   j
00001011  0B   VT     00101011  2B   +      01001011  4B   K      01101011  6B   k
00001100  0C   FF     00101100  2C   ,      01001100  4C   L      01101100  6C   l
00001101  0D   CR     00101101  2D   -      01001101  4D   M      01101101  6D   m
00001110  0E   SO     00101110  2E   .      01001110  4E   N      01101110  6E   n
00001111  0F   SI     00101111  2F   /      01001111  4F   O      01101111  6F   o
00010000  10   DLE    00110000  30   0      01010000  50   P      01110000  70   p
00010001  11   DC1    00110001  31   1      01010001  51   Q      01110001  71   q
00010010  12   DC2    00110010  32   2      01010010  52   R      01110010  72   r
00010011  13   DC3    00110011  33   3      01010011  53   S      01110011  73   s
00010100  14   DC4    00110100  34   4      01010100  54   T      01110100  74   t
00010101  15   NAK    00110101  35   5      01010101  55   U      01110101  75   u
00010110  16   SYN    00110110  36   6      01010110  56   V      01110110  76   v
00010111  17   ETB    00110111  37   7      01010111  57   W      01110111  77   w
00011000  18   CAN    00111000  38   8      01011000  58   X      01111000  78   x
00011001  19   EM     00111001  39   9      01011001  59   Y      01111001  79   y
00011010  1A   SUB    00111010  3A   :      01011010  5A   Z      01111010  7A   z
00011011  1B   ESC    00111011  3B   ;      01011011  5B   [      01111011  7B   {
00011100  1C   FS     00111100  3C   <      01011100  5C   \      01111100  7C   |
00011101  1D   GS     00111101  3D   =      01011101  5D   ]      01111101  7D   }
00011110  1E   RS     00111110  3E   >      01011110  5E   ^      01111110  7E   ~
00011111  1F   US     00111111  3F   ?      01011111  5F   _      01111111  7F   DEL


A Char is short for character. Each key code on your keyboard sends a byte, which corresponds to the standard binary values in the table.


This format stays the same between systems. Otherwise, documents would fail to load and would end up printing out gibberish.


Char codes 00 to 1F hex are control codes rather than printable text. Apart from a few, such as tab (09), line feed (0A), and carriage return (0D), they are rarely saved to text documents or web pages.


Processor cores come equipped with text processing functions, such as changing a binary number to a hex number. A byte is changed into two bytes with values 0 to 9, A to F.


An array of characters is called a String of text. You could say a text document is an Array of characters, or is a really long string.


The space bar is 20 hex. Without space as a code, there is no space between words in documents. Also, 30 hex to 39 hex are your 0 to 9 digits. Capitals ascend from 41 hex, and smalls ascend from 61 hex.


When dividing a binary number by any base 2 to 36, any remainder that is less than 10 is added to 30 hex, which creates digits 0 to 9 per place value.


When a remainder is 10 or higher, we subtract 10 and add the result to 61 hex for the alphabet (or 41 hex for capitals), which creates digits A to Z per place value.


This is essentially how the toString method is implemented in programming languages to convert a number to any number base 2 to 36.
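
The divide-and-remainder method can be sketched in Python as follows. This is my own minimal version for non-negative numbers, not the actual source of any language's toString; it uses 61 hex so the letters come out as smalls.

```python
def to_string(number: int, base: int) -> str:
    """Convert a non-negative integer to text in any base from 2 to 36."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if number < 0:
        raise ValueError("this sketch only handles non-negative numbers")
    if number == 0:
        return "0"
    codes = []
    while number > 0:
        number, remainder = divmod(number, base)
        if remainder < 10:
            codes.append(0x30 + remainder)       # char codes for 0 to 9
        else:
            codes.append(0x61 + remainder - 10)  # char codes for a to z
    # Remainders come out lowest place value first, so reverse them.
    return bytes(reversed(codes)).decode("ascii")


print(to_string(255, 16))  # ff
print(to_string(255, 2))   # 11111111
print(to_string(255, 10))  # 255
```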


Text data is also the same across systems, with each byte value representing a different character picture to draw from a font file. It is also important that the font file stores the pictures relevant to each byte value (Otherwise we end up with gibberish).


When decoding a binary file, you will often run into signatures, sometimes called magic numbers.


When decoded as characters, they usually mean something. For example, in a Microsoft executable the first two bytes are 4D 5A = “MZ”.


The two-character string “MZ” is the initials of Mark Zbikowski.


ZIP compressed files start with the code 50 4B, which are the characters “PK”, the initials of Phil Katz, the creator of PKZIP and founder of PKWARE.
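
Checking a signature is just comparing the first bytes of the file. A small Python sketch, using only the two signatures mentioned above (the table and function name are my own):

```python
# First bytes of a file mapped to the format they identify.
SIGNATURES = {
    bytes([0x4D, 0x5A]): "Microsoft executable (MZ)",
    bytes([0x50, 0x4B]): "ZIP compressed file (PK)",
}


def identify(data: bytes) -> str:
    """Return the format name whose signature matches the start of the data."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


print(identify(bytes([0x4D, 0x5A, 0x90, 0x00])))  # Microsoft executable (MZ)
print(identify(b"PK\x03\x04"))                    # ZIP compressed file (PK)
```

To check a real file, read its first few bytes in binary mode and pass them to identify.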


System protocols in operating systems also accept a string of text in this format. It is also the same across systems.


There is also Unicode text stored as UTF16, which is less common than byte text in files and web pages. It reads a word instead of a byte. This way, there are 2^16=65536 combinations to represent characters.


Also, UTF16 can be labelled UTF16-LE or UTF16-BE, which is little-endian or big-endian byte order.
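
You can see the difference between byte text and the two UTF16 byte orders by encoding the same two characters in Python. This is just a quick illustration.

```python
text = "MZ"

# One byte per character.
print(text.encode("ascii").hex(" "))      # 4d 5a

# One word (two bytes) per character, in each byte order.
print(text.encode("utf-16-le").hex(" "))  # 4d 00 5a 00
print(text.encode("utf-16-be").hex(" "))  # 00 4d 00 5a
```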


Also you can read over The Unicode Standard: A Technical Introduction.


The first 128 UTF8 codes are the same values as in UTF16 and in the table above. The difference is that UTF8 uses more than one byte for characters past those codes, while UTF16 uses a word.


Also, plain ASCII text is often referred to as UTF8, because the ASCII codes in the table above are the first 128 codes of the UTF8 standard.

Fast binary conversion.

At the start I showed how hex is used to shorten a binary number and can quickly be changed between binary and hex.


The same thing can be done with any number base that is a power of 2: base 4, base 8 (octal), base 16 (hexadecimal), and base 32 can all be directly translated to and from binary.


The reason it works is that the last value, F = 15, is exactly equal to the last value of a 4-bit binary number, 1111 = 15.


F is the last hex digit, so F + 1 = 10 in place value. Thus to change the value back to binary, we go.


10 = 1 (0001), 0 (0000) = 0001 0000


So 9E hex is.


9E = 9 (1001), E (1110) = 1001 1110


Dividing a binary number in sections of 16 per place value actually causes us to move in sections of 4 binary digits.


101000101010100101010100101001010 divide by 16
10100010101010010101010010100 divide by 16
1010001010101001010101001 divide by 16
101000101010100101010 divide by 16
10100010101010010 divide by 16
1010001010101 divide by 16
101000101 divide by 16
10100 divide by 16
1 last place value in 0 to 15 digit.


Dividing the number by base 16 is not necessary since we can just read the number in sections of 4 bits.


Splitting a binary number of any size into sections of 4 from right to left allows us to convert to hex instantly.


Split=(1),(0100),(0101),(0101),(0010),(1010),(1001),(0100),(1010)


Hex=(1),(4),(5),(5),(2),(A),(9),(4),(A)


Thus hex characters can easily be translated from two character bytes to a single byte, writing the real byte value to memory.


Modern x86 cores have this operation built-in.


Octal   0   1   2   3   4   5   6   7
Binary  000 001 010 011 100 101 110 111


Octal uses base 8, with the digits 0 to 7. The last octal digit is also equal to the last 3-digit binary combination, 111 = 7.


So 7+1=10 in place value. Thus every octal digit is 3 binary digits. To change the value back to binary, we go.


10 = 1 (001), 0 (000) = 001 000


So 53 Octal is.


53 = 5 (101), 3 (011) = 101 011


101000101010100101010100101001010


Splitting a binary number of any size into sections of 3 from right to left allows us to convert to octal instantly.


Split = (101), (000), (101), (010), (100), (101), (010), (100), (101), (001), (010)


Octal = (5), (0), (5), (2), (4), (5), (2), (4), (5), (1), (2)
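
Both the hex and the octal split follow the same pattern, so one small Python function can cover them. This is a sketch with names of my own choosing; bits_per_digit is 4 for hex, 3 for octal.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUV"  # enough digits for up to base 32


def group_convert(binary_text: str, bits_per_digit: int) -> str:
    """Convert binary text to a power-of-2 base by reading groups of bits."""
    # Pad on the left so the groups line up from right to left.
    padding = -len(binary_text) % bits_per_digit
    padded = "0" * padding + binary_text
    groups = [padded[i:i + bits_per_digit]
              for i in range(0, len(padded), bits_per_digit)]
    return "".join(DIGITS[int(group, 2)] for group in groups)


number = "101000101010100101010100101001010"
print(group_convert(number, 4))  # 14552A94A
print(group_convert(number, 3))  # 50524524512
```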


Base 10 is not a power of 2, unlike base 4, base 8 (octal), base 16 (hexadecimal), and base 32.


When the number base is not a power of 2, the number must be divided by the number base, and the remainders are the digits for each place value.


In our case, we divide by 10 and add the remainders to character byte codes to form our string.


Changing a string of digits back into an integer is just a matter of multiplying each digit by its place value, which is a power of the number base, and adding the results.
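
Going from characters back to an integer can be sketched like this in Python. Multiplying the running value by the base at each step gives every digit its place value. The name parse_digits is hypothetical.

```python
def parse_digits(text: str, base: int) -> int:
    """Turn a string of digits in the given base (2 to 36) into an integer."""
    value = 0
    for char in text.lower():
        code = ord(char)
        if 0x30 <= code <= 0x39:      # char codes for 0 to 9
            digit = code - 0x30
        elif 0x61 <= code <= 0x7A:    # char codes for a to z
            digit = code - 0x61 + 10
        else:
            raise ValueError(f"not a digit: {char!r}")
        if digit >= base:
            raise ValueError(f"digit {char!r} is too big for base {base}")
        value = value * base + digit  # shift one place value, add the digit
    return value


print(parse_digits("9E", 16))  # 158
print(parse_digits("53", 8))   # 43
print(parse_digits("255", 10)) # 255
```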

Array.

An array is a set of bytes or words written one after another in memory.


All primitive data types can be read one after another linearly in an array, which gives the best performance. However, it can get more complicated.


Each number in the array can be the location of a string, where a string is a variable-length array of UTF8/UTF16 characters, or the location of another array in the case of a 2D array.


The x86 processor has an address system built in to handle reading elements one after another, using a base address plus an index.


This is called scale index base. Explanations of it are often more complicated than they have to be, so I will put it simply below.


The base address is the location of the bytes we are storing in memory sequentially. We add which byte we want to read to the base address (this is called the index).


If we want the byte at position three in the array of bytes, we set the index to three and read the byte at the base address plus three. Now if the array is a grouping of words rather than bytes, we must multiply the index by 2, because each word is two bytes.


Each element in an array of double words is 4 bytes, so we take the starting position of our array and add which element we want to read multiplied by 4. Lastly, an array of qwords is index times 8.
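
The address calculation is simple enough to try in Python on a block of bytes standing in for memory. The layout here (an array of four DWORDs starting at position 16) is made up for the example.

```python
import struct


def element_address(base: int, index: int, scale: int) -> int:
    """Base address plus index times element size (1, 2, 4, or 8 bytes)."""
    return base + index * scale


# Pretend memory: 16 unused bytes, then four little-endian DWORDs.
memory = bytes(16) + struct.pack("<4I", 10, 20, 30, 40)
base = 16

address = element_address(base, 2, 4)  # element 2 of a DWORD array
(value,) = struct.unpack_from("<I", memory, address)
print(address, value)  # 24 30
```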


ARM cores have a simpler addressing system built in, so some indexed reads can take an extra ARM processor instruction. This can make reading arrayed and indexed files take longer, as well as reading the file system array structure that contains all files on your disk drive.


Programming languages are built on primitives and arrays of primitive types in aligned memory. You will learn more about this in the “Code” document.


Overview.

Programming languages all use the same primitive data types.


JavaScript: JavaScript assumes all numbers are float64 (double). During any bitwise operation, a number changes to int32 and then back to float64. JavaScript is unique as we can do every number type using bitwise operations when we need a 32-bit integer or smaller. We can extend a 32-bit int to 64 with a little bit of additional code as needed. A tool called TypeScript changes defined number types into JavaScript code.

Java: Java assumes that all integers are signed, so the regular toString method splits the number into two from the center and adds the sign. In Java 8 and later, a new unsigned toString method was added to convert to a string that shows the integer as is, as an unsigned value. Adding and subtracting signed and unsigned numbers works the same as with regular numbers. All that changes is how we display the value as characters.

C and C++: These languages let you pick how you want an integer to be treated in code. All data types are also the same, with better control.


The primitive data types are the same because the CPU/ALU (arithmetic logic unit) is what processes them, not the programming language. The binary arithmetic units in processors work the same way, as positional arithmetic never changes format even if you switch processor types. Text data is also standardized.


Also, memory is in bytes no matter what system you use. Thus data types are still in word size even if you switch processor architecture.


Understanding the primitive data types at machine code level, and how the language treats them, can make a big difference in understanding your code.


Generally speaking, the primitives are the same across systems. The same is true for code recompiled from other systems in emulators.


Integer numbers and floating-point numbers are the same, as they cannot be anything other than positional in base 2. A string of text is the same.


Processors today even have built-in number base conversion to characters and back again to a number, using the standard character codes.


Even arrays are read the same across systems. However, x86 cores are excellent at reading matrices and arrays because of the address system.

Video memory.

Displays are made of little squares that we can individually change the color of called pixels. Each pixel has a red, green, and blue value that lets us set the color of a pixel.


Red, green, and blue colors of light add together, so it is not hard to visualize the added color in your head. You may want to learn what additive colors of light are, and practice making red, green, blue values a little.


The color of each pixel is stored in video memory, which some people like to call graphics memory.


The operating system does not set up video memory. That is the job of the video output components you solder onto a system, which feed a display output such as HDMI, VGA, DisplayPort, or an internal display.


The video out components you put onto a modern motherboard or game console are standardized and connect to an HDMI, VGA, DisplayPort, or internal display.


You can set the number of bits that are used to define each red, green, and blue color.


The higher the number of bits you use, the more colors you can make. Typically 8 bits are used for each of red, green, and blue.


You can change the RAM address location you wish to use for video memory.


The format you write to video memory is generally groups of three bytes, one byte each for red, green, and blue, per pixel.


The first three bytes are the pixel in the top left corner, and each following three bytes continue across the screen. We start on the next line when we reach the end of the line.


The distance across, labeled X, is counted in steps of three bytes. The distance up and down, labeled Y, is counted in lines, each the number of pixels across. The rest is your graphics function and the three bytes you wish to write for color.


The number of lines up and down, and the number of pixels across before the next line, is set by the resolution.
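
Here is a small Python sketch of that layout, using a byte array in place of real video memory. The tiny 4 by 3 resolution and the function names are assumptions made for the example.

```python
WIDTH, HEIGHT = 4, 3                      # tiny made-up resolution
video_memory = bytearray(WIDTH * HEIGHT * 3)  # three bytes per pixel


def pixel_offset(x: int, y: int, width: int = WIDTH) -> int:
    """Byte position of a pixel: full lines above it, then pixels across."""
    return (y * width + x) * 3


def set_pixel(memory: bytearray, x: int, y: int,
              red: int, green: int, blue: int) -> None:
    """Write the three color bytes for one pixel."""
    offset = pixel_offset(x, y)
    memory[offset:offset + 3] = bytes([red, green, blue])


set_pixel(video_memory, 0, 0, 255, 255, 255)  # top left corner, white
set_pixel(video_memory, 1, 2, 255, 0, 0)      # second pixel on the last line, red
print(pixel_offset(1, 2))  # 27
```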


You can also adjust how many times in one second you want the video output components to display changes made to the video memory. This is usually set to 60 times in one second, known as 60 hertz.


When creating an operating system, we call this area of memory in address space the frame buffer.


Even handheld game consoles like the Nintendo DS have a video memory location that you can write to for setting each individual pixel color on the screens. When you reach the end of the first display in memory and go to set the next three bytes, you then start at the top left corner of the touch screen. This is also how multi-screen graphics is done with one video memory location.


The Nintendo DS used two ARM cores. You would run your graphics functions on the ARM9, and the game's instructions and logic on the ARM7.


You could also mix and match if you wanted. However, it was simpler and better to separate the jobs on two cores. The format the video memory worked in is the same as PC or any graphics device.


The Nintendo DS was amazing and fun to build games on if you understood the basics.


When we insert a graphics card, we can disable the on-board video memory location and issue commands to set a pixel's color at x and y in the graphics card's video memory.


The graphics card acts as a separate computer with its own video memory and display output converter. A graphics card can also run our graphics function or code so the CPU does not have to.


This includes jobs such as filling in a rectangle of pixels or calculating 3D angles. This frees up the CPU, because the graphics card then does the graphics drawing functions in its own video memory, and also frees up RAM.


In actuality, you can build an operating system that only does software-rendered graphics, which runs on all systems and all motherboard configurations. It is also very easy to do, and is good coding practice.


However, it is recommended to add hardware-accelerated graphics through a GPU by implementing basic method calls on the GPU.


Video memory is very standard across all consoles and mobile/PC systems. You may also enjoy the following on graphics cards and video memory: techtarget.com, Video memory.


Also, the Nintendo 3DS had three screens one after another in video memory: one for the left eye, one for the right eye, and the touch screen.


The bitmap picture format is based on the raw binary form of graphics memory and is a hardware-independent picture format.


File formats start with what we usually call a file header. A header usually has a signature, which should always be the same byte values. If the bytes do not match, then the file we are reading is most likely corrupted or is not the type we expected.


After the file signature, a header can specify the width and height, if it is a picture, and various other things.


Pictures generally store groups of three bytes in red, green, blue per pixel in an array. After the index exceeds the width, it moves to the next line of the picture, until the picture height is reached.


If the picture format is stored in any other way than red, green, and blue, then we would have to change it back into red, green, blue before writing it to video memory to display the picture.


I used to actually write bitmap pictures one byte at a time in a hex editor. It is the equivalent of doing pixel art. It is also good practice.


When you click save and see your picture load in a picture program, that is the moment it is magical, considering you just wrote the picture in 1’s and 0’s.
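
If you would rather let a program type the bytes, here is a minimal Python sketch that builds a 24-bit bitmap in memory. It assumes the common uncompressed BMP layout: a 14-byte file header starting with “BM”, a 40-byte info header, rows stored bottom line first and padded to a multiple of four bytes, and each pixel stored as blue, green, red. The function name make_bmp and the 2835 pixels-per-meter value (about 72 DPI) are my own choices.

```python
import struct


def make_bmp(width: int, height: int, pixels) -> bytes:
    """Build a 24-bit BMP. pixels is a list of rows from top to bottom,
    each row a list of (red, green, blue) values."""
    row_size = (width * 3 + 3) // 4 * 4        # rows are padded to 4 bytes
    padding = bytes(row_size - width * 3)
    body = bytearray()
    for row in reversed(pixels):               # bottom line is stored first
        for red, green, blue in row:
            body += bytes([blue, green, red])  # BMP stores blue, green, red
        body += padding
    offset = 14 + 40                           # pixel data follows both headers
    file_header = struct.pack("<2sIHHI", b"BM", offset + len(body), 0, 0, offset)
    info_header = struct.pack("<IiiHHIIiiII", 40, width, height, 1, 24,
                              0, len(body), 2835, 2835, 0, 0)
    return file_header + info_header + bytes(body)


# A 2 by 2 picture: red and green on top, blue and white below.
picture = make_bmp(2, 2, [[(255, 0, 0), (0, 255, 0)],
                          [(0, 0, 255), (255, 255, 255)]])
print(len(picture), picture[:2])  # 70 b'BM'
```

Writing the returned bytes to a file ending in .bmp should give you a picture you can open in a picture program, and you can then inspect the same bytes in the hex editor.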


Lastly you can get very creative in software with raw graphics format. You can compare the difference in colors and define edges and shapes and create an artificial intelligence that learns and understands its surroundings, or create video/picture enhancers, and filters. The possibilities are limitless.

Audio format.

Audio is also another standard format. It can be a bit confusing at first. It also does not change between systems, just like video memory.


An audio file can consist of an array of values of size byte (1), word (2), dword (4), or qword (8).


An integer that is a dword (32 binary digits) gives a range of control of 2^32-1. The value is the point to move the magnet in the speaker to. We call this a sample point.


The speed at which each integer (sample point) is given to the PCM (pulse-code modulation) device is called the sample rate. A sample rate of 10 means 10 points per second.


We would also call this 10 hertz. We use metric to represent larger numbers, so 1 kilohertz would mean 1000 points a second.


The easiest way to think about audio is that sound is movement (vibration). Thus it is best to describe how it is recorded and played back.


The values reflect the time of each recorded position the magnet was in a microphone. Allowing us to capture in time the movement and vibration of sound.


The speed at which each value is recorded is called the sample rate in hertz. Which is how many sample points we are recording in one second.


The faster the sample rate, then the more precise the audio reproduction is. Also, The bigger of an integer number we use, the more precise each point is for the position of the magnet in the speaker coil.


Now lets say we wish to output two different audio signals to two different speakers. We specify to the PCM that we are using 2 audio channels.


Normally a sample rate of 10 means 10 points per second. However, this time the PCM reads two points at a time, one for each channel. This means we need 20 points for every second of audio when the PCM is set to 2 audio channels at a sample rate of 10.


There is no limit on how many audio channels you can have or how high in quality you can go, but remember that in an audio stream with two channels, every second point belongs to the second audio channel. Likewise, with three audio channels, every third point belongs to the third audio channel, and so on.


This is how uncompressed audio works across all systems. Generally, this is how audio is given as an audio stream at system level, similar to how video memory works.
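The channel layout described above, where every second point belongs to the second channel, can be sketched as follows. The helper names `interleave` and `deinterleave` are made up for this example, and the sample values are arbitrary.

```python
def interleave(channels):
    """Merge per-channel point lists into one stream: ch1, ch2, ch1, ch2, ..."""
    if len({len(points) for points in channels}) != 1:
        raise ValueError("every channel needs the same number of points")
    stream = []
    for frame in zip(*channels):
        # One frame holds one point for each channel.
        stream.extend(frame)
    return stream


def deinterleave(stream, channel_count):
    """Split a stream back into one list of points per channel."""
    return [stream[i::channel_count] for i in range(channel_count)]


left = [1, 2, 3]
right = [10, 20, 30]
stream = interleave([left, right])
print(stream)                   # [1, 10, 2, 20, 3, 30]
print(deinterleave(stream, 2))  # [[1, 2, 3], [10, 20, 30]]
```

With 2 channels and 10 points per channel the stream holds 20 points, which matches the 20 points per second needed at a sample rate of 10.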


For audio files to be playable, they must be converted to the standard PCM audio stream format. Then we set the number of channels, the sample rate, and the sample integer size in the PCM.


The wave audio format does not need converting, as it is written in raw PCM audio data format. The wave audio header specifies the points per second and the number of bits for each point.
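A minimal sketch of this, using Python's standard `wave` module rather than anything from JDasm: it writes a handful of points into a wave file held in memory and reads the header values back. It assumes word-sized (16-bit) points, which wave files store as signed little-endian integers; `write_wave` and `read_wave_header` are illustrative names.

```python
import io
import struct
import wave


def write_wave(target, points, sample_rate, channels=1):
    """Write signed 16-bit points as a wave file (target is a path or file object)."""
    with wave.open(target, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)             # 2 bytes (one word) per point
        out.setframerate(sample_rate)   # points per second for each channel
        out.writeframes(struct.pack("<%dh" % len(points), *points))


def read_wave_header(source):
    """Return (channels, bits per point, points per second, frame count)."""
    with wave.open(source, "rb") as src:
        return (src.getnchannels(), src.getsampwidth() * 8,
                src.getframerate(), src.getnframes())


buffer = io.BytesIO()
write_wave(buffer, [0, 1000, -1000, 0], sample_rate=8000)
buffer.seek(0)
print(read_wave_header(buffer))  # (1, 16, 8000, 4)
```

The bytes after the header are the same integers that would be handed to the PCM, so opening the result in a hex editor shows the points directly.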


You can learn more about digital audio and the wave audio format via this link.


Lastly, you can get very creative in software with the raw audio stream format. You can compare the difference in vibration to define sounds and create an artificial intelligence that learns and understands, or you can create sound enhancers and filters. The possibilities are limitless.

Closing.

Both wave audio files and bitmaps require very little effort to play/display or to create/modify, as they are stored in the standard hardware format for audio and graphics.


Phones, game systems, tablets, and PCs all deal with external hardware that only understands these basic formats, which are generally hardware independent.


Here are some of the basic things we can do computationally with the standard hardware formats for graphics and audio.


We can create our own stream samples using basic sin(x) at different speeds of vibration, together with decay rate calculations, to make synthesized sounds.
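A minimal sketch of such a synthesized sound is shown below. The exponential decay, the default sample rate of 8000, and the signed 16-bit maximum of 32767 are assumptions picked for this example, and `synth_note` is an illustrative name.

```python
import math


def synth_note(frequency_hz, seconds, sample_rate=8000,
               decay_per_second=3.0, max_value=32767):
    """Return signed integer sample points for a sine tone that fades out."""
    points = []
    for n in range(int(sample_rate * seconds)):
        t = n / sample_rate                          # time of this point in seconds
        envelope = math.exp(-decay_per_second * t)   # loudness falls off over time
        wave_value = math.sin(2 * math.pi * frequency_hz * t)
        points.append(round(max_value * envelope * wave_value))
    return points


note = synth_note(440, 0.5)
print(len(note))   # 4000 points for half a second at 8000 points per second
print(note[:8])    # the first few points of the vibration
```

Changing `frequency_hz` changes the speed of vibration (the pitch), and changing `decay_per_second` changes how quickly the note dies away.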


We do not have to write the PCM data to a wave audio file to play standard PCM audio. We simply give the integers in RAM to the PCM device.


Any developer who has built a piano application for Android, console, or PC should already know what uncompressed audio looks like and how to play it.


We can also do graphics using sin and cos to rotate vertices around the x, y, and z axes to make 3D models. It also helps to store small pictures containing colors to draw between vertices to make textured models.
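The sin and cos rotation mentioned above can be sketched like this. The functions use the standard rotation formulas for each axis; their names and the tuple layout `(x, y, z)` are choices made for this example.

```python
import math


def rotate_x(vertex, angle):
    """Rotate a vertex (x, y, z) around the x axis by angle in radians."""
    x, y, z = vertex
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)


def rotate_y(vertex, angle):
    """Rotate a vertex around the y axis."""
    x, y, z = vertex
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def rotate_z(vertex, angle):
    """Rotate a vertex around the z axis."""
    x, y, z = vertex
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c, z)


# A quarter turn around the z axis moves the point on the x axis
# onto the y axis (up to tiny floating point rounding).
print(rotate_z((1.0, 0.0, 0.0), math.pi / 2))
```

Applying the same rotation to every vertex of a model turns the whole model.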


Another file format that is fun to read byte by byte is the GIF picture: W3 GIF picture format.


GIF pictures store which colors are used in the picture into an array. The array is then indexed by the color data section.


Thus some picture programs will not let you edit GIF pictures directly in red, green, and blue, because each color you wish to use first has to be defined in the color array.


The workaround is to change the format to, say, bitmap, make your changes, and then save back to GIF. GIF compression is lossless, so the picture does not lose any color data as long as its colors fit in the color array (at most 256 entries).
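The color array idea can be sketched as below. This only shows the lookup step: in a real GIF file the color data section is LZW compressed and has to be unpacked first. The palette, the index values, and the name `expand_indexes` are made up for this example.

```python
def expand_indexes(indexes, palette):
    """Turn color data (indexes into the color array) into red, green, blue values."""
    return [palette[i] for i in indexes]


# The color array: every color the picture is allowed to use.
palette = [
    (0, 0, 0),      # index 0: black
    (255, 0, 0),    # index 1: red
    (0, 255, 0),    # index 2: green
    (0, 0, 255),    # index 3: blue
]

# The color data section: one index per pixel instead of three color bytes.
color_data = [1, 1, 2, 3, 0, 1]

print(expand_indexes(color_data, palette))

# Changing one entry in the color array recolors every pixel that uses it.
palette[1] = (255, 255, 0)
print(expand_indexes(color_data, palette))
```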


It can be very slow doing everything one step at a time on the CPU, so it becomes apparent that it is faster to build some of the code into hardware.


The only thing that requires extra know-how across systems is whether the system has hardware-accelerated functions built in for audio or graphics. Getting all the hardware operation codes right can therefore be tricky when building a game console emulator.


Modern game consoles can even keep runnable code that system updates can change, and games can call that code when needed. This makes modern game consoles even harder to emulate, even though the basic formats never change.

Moving onto Machine code.

With this basic understanding, you know how to write pictures into video memory to display them. You know how to translate audio into a standard stream to play it.


You know how to design audio or graphics functions that work on any platform or system hardware.


You understand that you can offload a lot of the work using separate processing units such as a GPU, but you need to design drivers that know how to interact with the GPU.


You know everything you need to know to design your own game system/PC/mobile/tablet and to program it and make your own OS, but you lack one small thing.


How do we select a CPU, and how do we program it in machine code? In the next section on machine code, I will explain how to map instruction sets by processor type and how we program them in machine code by processor type. It is a simple introduction to processor instruction sets and coding.