some of your computer's RAM is data (like cat pictures) and some of it is binary code for your CPU to run.
code always needs to be loaded from disk into RAM to run.
your CPU can only execute binary "machine code", which looks a bit like this:
0xc3, an instruction in x86 that means "return"!)
they're generally not only machine code -- usually they have some other data in them as well, like information about which libraries they need to dynamically link to. But a lot of what's in a binary executable is machine code.
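to make the "other data" idea concrete, here's a minimal sketch that reads the first few header bytes of an ELF file (the executable format Linux uses) and reports what kind of machine code it contains. the function name `describe_elf_header` and the `sample` bytes are made up for illustration, and only a few machine types are listed.

```python
import struct

ELF_MAGIC = b"\x7fELF"
# a few e_machine values from the ELF spec (not a complete list)
MACHINES = {3: "x86", 62: "x86-64", 183: "aarch64"}

def describe_elf_header(header: bytes) -> dict:
    """Summarize the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])        # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognized ELF class or byte order")
    # e_type and e_machine are two 16-bit fields right after the 16-byte ident
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {"bits": bits, "machine": MACHINES.get(e_machine, f"unknown ({e_machine})")}

# hypothetical sample: the start of a 64-bit little-endian x86-64 executable
sample = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
print(describe_elf_header(sample))
```

on a Linux machine you could try it on a real file with something like `describe_elf_header(open("/bin/cat", "rb").read(20))`.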
01110101011) valid machine code?
every CPU has an "instruction set" which defines which operations are valid and what they mean.
For example, one of the machine language instructions in the x86 instruction set is 0xb8, which is 10111000 in binary.
if it's invalid code, the CPU will trigger an interrupt (an "invalid opcode" exception on x86) that gets translated into the SIGILL signal on Unix.
the OS actually sets permissions (read/write/execute) on different parts of a process's memory. if the memory doesn't have execute permissions, you can't run the instruction there.
these permissions are called "memory protection" and on Linux, you can see a process's memory permissions with
$ cat /proc/$PID/maps
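each line of that file describes one region of memory, and the second field is the permissions (`r`, `w`, `x`, plus `p` or `s` for private/shared). here's a rough sketch of pulling those apart. `parse_maps_line` is a made-up helper, and the two sample lines are invented for illustration, not real output.

```python
def parse_maps_line(line: str) -> dict:
    """Parse one line of /proc/<pid>/maps into its address range and permissions."""
    # fields: address range, perms, offset, device, inode, optional path
    fields = line.split(maxsplit=5)
    start, end = (int(x, 16) for x in fields[0].split("-"))
    perms = fields[1]
    return {
        "start": start,
        "end": end,
        "read": perms[0] == "r",
        "write": perms[1] == "w",
        "execute": perms[2] == "x",
        "path": fields[5].strip() if len(fields) > 5 else "",
    }

# invented sample lines in the /proc/<pid>/maps format
sample_lines = [
    "55d4c8a00000-55d4c8a02000 r-xp 00000000 08:01 1234 /usr/bin/cat",
    "7ffc1b2e0000-7ffc1b301000 rw-p 00000000 00:00 0 [stack]",
]

for line in sample_lines:
    region = parse_maps_line(line)
    status = "can run code" if region["execute"] else "no execute permission"
    print(f'{region["path"]}: {status}')
```

in this made-up sample, the region mapped from `/usr/bin/cat` is executable and the stack is not, so the CPU would refuse to run instructions stored on the stack.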
there's a special register called the "instruction pointer". it holds the address of the next instruction to be executed.
most instructions have arguments. For example, on x86 the opcode 0xb8 loads a constant into the eax register, and takes one 32-bit argument (which is the constant to load).
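here's a toy sketch of how the instruction pointer and instruction arguments fit together. it "runs" just two x86 opcodes, 0xb8 (load a 32-bit constant into eax) and 0xc3 (return), and treats anything else as an illegal instruction. this is a big simplification: the `run` function is made up for illustration, and a real `ret` jumps to a return address on the stack, while here it just stops and hands back eax.

```python
import struct

def run(code: bytes) -> int:
    """Toy interpreter for two x86 instructions: mov eax, imm32 (0xb8) and ret (0xc3)."""
    ip = 0   # instruction pointer: offset of the next instruction to execute
    eax = 0  # the one register this toy CPU has
    while ip < len(code):
        opcode = code[ip]
        if opcode == 0xB8:
            # the argument is the 4 bytes after the opcode, little-endian
            (eax,) = struct.unpack_from("<I", code, ip + 1)
            ip += 5  # 1 opcode byte + 4 argument bytes
        elif opcode == 0xC3:
            return eax  # simplified "return": stop and give back eax
        else:
            # a real CPU would raise an exception here (SIGILL on Unix)
            raise ValueError(f"illegal instruction 0x{opcode:02x} at offset {ip}")
    raise ValueError("ran off the end of the code")

program = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])  # mov eax, 42; ret
print(run(program))  # prints 42
```

notice that the instruction pointer has to skip over the argument bytes: the CPU only knows where the next instruction starts because the opcode tells it how long the current one is.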
assembly is a slightly more human-readable programming language that we use to make it easier for humans to write machine code. For example, mov (%rax),%rdx is 0x488b10 in machine code.
assembly is still challenging to read but at least it's not just a bunch of numbers :)
you can easily translate machine code to more human readable assembly! The program you use to do this is called a disassembler, like
$ objdump -d /bin/cat
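to show the idea behind a disassembler, here's a tiny sketch that only knows three x86-64 encodings (the ones mentioned on this page) and prints everything else as a raw byte. the `disassemble` function is hypothetical, and a real disassembler like objdump handles thousands of encodings and much more complicated decoding rules.

```python
import struct

def disassemble(code: bytes) -> list[str]:
    """Translate a few known x86-64 byte patterns into AT&T-style assembly."""
    out = []
    i = 0
    while i < len(code):
        if code[i] == 0xC3:
            out.append("ret")
            i += 1
        elif code[i] == 0xB8 and i + 5 <= len(code):
            # opcode 0xb8 is followed by a 32-bit little-endian constant
            (imm,) = struct.unpack_from("<I", code, i + 1)
            out.append(f"mov $0x{imm:x},%eax")
            i += 5
        elif code[i:i + 3] == b"\x48\x8b\x10":
            out.append("mov (%rax),%rdx")
            i += 3
        else:
            # unknown to this sketch: show the raw byte and move on
            out.append(f".byte 0x{code[i]:02x}")
            i += 1
    return out

sample = bytes.fromhex("488b10 b82a000000 c3")
for line in disassemble(sample):
    print(line)
```

an assembler for the same three instructions would just be this table run in the other direction, from text to bytes.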
similarly, you can translate assembly to machine code with an assembler.
C code is translated to machine code by your compiler, like clang or MSVC
javac, does it get translated to machine code?
the machine code that's running when you run a Java program is the JVM (/usr/bin/java or something)
Java programs are compiled, but they're compiled to JVM bytecode, not machine code.
the JVM is an example of a program that does this -- its JIT will compile frequently called bits of JVM bytecode into machine code.
this is also how some software exploits work -- they try to insert new machine code into your memory and run it.