A programming language is a system of notation for writing computer programs.
Languages can be viewed through two components.
Semantics (function) - refers to the meaning behind the code, such as what a particular command or function does.
Syntax (form) - refers to the rules and structure of a programming language, such as how code is written and formatted.
Together, semantics and syntax define a language.
A language can also be described by its specification or by an implementation.
Language specification - a formal description of the language’s syntax and semantics.
Language implementation - the actual software that reads programs (by compiling or interpreting them) and executes them.
A specification aims to remove ambiguity about how the language should behave, and an implementation is what brings the specification to life.
Semantics
The semantics of a programming language refers to the meaning of its elements - the “what happens” when the code is executed.
Data types and Structures
- Every language has its own set of data types (such as integers, strings, or booleans) and structures (like arrays or objects).
- What happens when you manipulate these components and how these components react with other components ultimately dictate the language’s semantics.
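As a small illustration of how data types shape semantics, the sketch below uses Python, where the same `+` operator means something different depending on the types involved. The variable names are made up for this example.

```python
# The same operator, `+`, has a different meaning for each data type.
int_sum = 1 + 2          # integer addition
str_sum = "1" + "2"      # string concatenation
list_sum = [1] + [2]     # list concatenation

print(int_sum)   # 3
print(str_sum)   # 12
print(list_sum)  # [1, 2]
```

The syntax of the three lines is identical; only the semantics attached to the operand types differs.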
Control Flow
- How code execution is controlled, including loops, conditional statements, and functions.
- Defines the order of execution, what is executed, how decisions are made, and how subroutines are handled.
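A minimal Python sketch of these pieces working together: a loop, two conditionals, and a subroutine that returns to its caller. The function name `classify` is hypothetical and chosen only for illustration.

```python
def classify(numbers):
    """Label each non-negative number as 'even' or 'odd'."""
    labels = []
    for n in numbers:            # loop: repeat for every element
        if n < 0:                # conditional: skip negative values
            continue
        if n % 2 == 0:
            labels.append("even")
        else:
            labels.append("odd")
    return labels                # control returns to the caller

result = classify([3, -1, 4])
print(result)  # ['odd', 'even']
```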
Error Handling
- Defines how the language handles errors
- Throwing exceptions, logging errors, or halting execution.
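The sketch below shows one way these options can look in Python: the division throws an exception, and the handler logs it and returns a fallback value instead of halting. `safe_divide` is a made-up helper, not a standard function.

```python
import logging

def safe_divide(a, b):
    """Divide a by b, logging the error and returning None on failure."""
    try:
        return a / b
    except ZeroDivisionError as exc:
        # Log the error instead of letting it halt the program.
        logging.error("division failed: %s", exc)
        return None

ok = safe_divide(10, 2)
failed = safe_divide(1, 0)
print(ok)      # 5.0
print(failed)  # None
```

Without the `try`/`except`, the uncaught exception would stop execution at the failing line.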
Memory Management
- How memory is used and allocated/deallocated
- Includes variable creation/destruction and garbage collection
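To make object creation and destruction visible, here is a small Python sketch that watches an object through a weak reference. The `Node` class is hypothetical, and the exact moment of deallocation is an implementation detail (the comments describe CPython, which uses reference counting plus a garbage collector).

```python
import gc
import weakref

class Node:
    """Hypothetical class used only to observe an object's lifetime."""

node = Node()                  # memory is allocated when the object is created
watcher = weakref.ref(node)    # a weak reference does not keep the object alive
alive_before = watcher() is not None

del node                       # drop the only strong reference
gc.collect()                   # ask the garbage collector to run
alive_after = watcher() is not None

print(alive_before)  # True
print(alive_after)   # False on CPython
```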
Concurrency and Parallelism
- Only applies if the language supports it.
- Defines how multiple tasks are interleaved (concurrency) or run at the same time (parallelism).
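As a rough sketch, Python’s standard `concurrent.futures` module can run several tasks on a pool of threads. The `square` function and the worker count are arbitrary choices for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    """Toy task standing in for real work."""
    return n * n

# Submit five tasks to a pool of four worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    squares = list(pool.map(square, range(5)))  # results keep input order

print(squares)  # [0, 1, 4, 9, 16]
```

Whether the threads truly run in parallel depends on the language implementation; the semantics only guarantee that each task completes and that `map` returns results in input order.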
Semantics can vary significantly from one language to another.
Syntax
Syntax refers to a set of rules that define how programs written in the language must be structured. Syntax is made of lexical elements and grammatical structure.
- Lexical elements and grammatical structure are what govern how statements and expressions are correctly formed.
Lexical Elements:
These are the basic components of a language. They include keywords (like if, else, for), identifiers (variable or function names), operators (+, -, *, /), literals (like 1, "hello"), and punctuation symbols ({}, (), ,).
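One way to see lexical elements directly is to run Python’s standard `tokenize` module over a short snippet. The snippet in `source` is invented for this example, and note that `tokenize` reports keywords such as `if` under the same `NAME` category as identifiers.

```python
import io
import tokenize

source = 'if x > 1:\n    print("hello")\n'

# Keep only names, operators, numbers, and strings; skip layout tokens
# such as NEWLINE, INDENT, and DEDENT.
wanted = (tokenize.NAME, tokenize.OP, tokenize.NUMBER, tokenize.STRING)
tokens = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type in wanted
]

for kind, text in tokens:
    print(kind, text)
```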
Grammatical Structure (Syntax):
This defines how the basic components can be combined to form valid statements or expressions in the language. For example, an if statement in Python is written as if condition: statement, where condition is an expression that evaluates to a boolean value and statement is the code that gets executed if the condition is true.
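The grammar can be checked without running any code. This sketch uses Python’s standard `ast.parse` to test whether a snippet is well formed; `is_valid` is a hypothetical helper written for this example.

```python
import ast

def is_valid(source):
    """Return True if the source obeys Python's grammar."""
    try:
        ast.parse(source)      # parses only; nothing is executed
        return True
    except SyntaxError:
        return False

valid = is_valid("if x > 1:\n    y = 2\n")
invalid = is_valid("if x > 1\n    y = 2\n")   # missing colon

print(valid)    # True
print(invalid)  # False
```

The first snippet parses even though `x` is never defined, because that is a question of semantics rather than syntax.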
Many programming languages, such as C++, Java, and JavaScript, have syntax derived from the C programming language, which relies on braces and semicolons. Other languages, such as Python and Ruby, use syntax that is often considered closer to plain English and easier to read.
Important
Low-level vs High-level Languages
Programming languages are classified as low-level or high-level depending on abstraction from the hardware.
Low-level languages (assembly languages, and often C by comparison with newer languages) have little abstraction from the computer’s instruction set architecture. They offer a high degree of control over the hardware but require a lot of knowledge about the computer’s architecture and memory model.
High-level languages (Python or Java) abstract the precise hardware details. They have features like garbage collection and bounds checking that make them easier to use but perhaps less performant for very specific tasks.
Scripting vs System Languages
Scripting languages (Python, JavaScript) were traditionally used for short and simple tasks, although they are now also used for large applications. They are typically interpreted rather than compiled ahead of time, and they have features well suited to text processing or automating system tasks.
System languages (C, Rust) are used for writing operating systems and other low-level applications. They give close control over system resources and are compiled to machine code.
Interpreter vs Compiler
An interpreter reads and executes the code statement by statement. If there is an error, it stops at that point and reports it, which can make debugging easier. Every time the program is run, the interpreter has to process the source code again.
A compiler converts the entire source code into machine code before the program is run. Errors it detects are reported during the compilation step, and the program doesn’t run until they are resolved, which can slow down the edit-and-test cycle. Once the program is compiled, it can be run multiple times without further translation.
The choice between an interpreted language and a compiled language doesn’t matter as much as it used to.
- Java compiles into bytecode, which is then executed by the Java Virtual Machine (JVM).
- Just-In-Time (JIT) compilers compile code to machine code while the program is running.
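Python’s built-in `compile` and `exec` functions give a small view of this blurred line: the source is first translated into a bytecode object, and that object is then executed. The source string and the `<example>` file name are placeholders for this sketch.

```python
source = "total = 2 + 3\n"

# Step 1: translate the source text into a bytecode object.
code_obj = compile(source, "<example>", "exec")

# Step 2: execute the bytecode in a fresh namespace.
namespace = {}
exec(code_obj, namespace)

print(type(code_obj).__name__)  # code
print(namespace["total"])       # 5
```

The same `code_obj` can be executed again without repeating the translation step.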
Static vs Dynamic Typing
In statically-typed languages (C++, Java) every variable has a type that is known before the program runs, usually because you declare it, and it can’t change.
In dynamically-typed languages (Python, Ruby) you do not have to declare a variable’s type, and the variable can hold a value of any type.
For example, in a statically-typed language, if you declare a variable as an integer, you cannot later assign a string to it, whereas in a dynamically-typed language you can assign any type of value to the variable without declaring a type.
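The dynamic side of this comparison can be sketched in Python, where the same name is rebound to values of different types. The variable names are arbitrary.

```python
value = 5                            # the name currently refers to an int
first_type = type(value).__name__

value = "five"                       # rebinding to a str is allowed
second_type = type(value).__name__

print(first_type)   # int
print(second_type)  # str

# A type annotation documents intent but is not enforced at run time;
# a separate static type checker would be needed to flag this line.
count: int = "five"
```

In a statically-typed language, the equivalent of the second assignment would be rejected before the program runs.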
Type System
A type system is a collection of rules that assign a property called a “type” to the programming constructs (variables, expressions, functions or modules) a computer program is composed of. A “type” is used to reduce the possibility for bugs in programs by defining the connections between different parts of a computer program, and then checking that they are connected in a consistent way.
- For example, if you declare a variable of type “integer” and then try to assign it the string “hello”, the type system reports a type error, because it ensures that the variable can only hold integer values.
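Python checks types while the program runs rather than beforehand, but the idea of checking that parts are connected consistently is the same. In this sketch, the hypothetical function `total_length` expects a list of strings, and passing an integer in the list triggers a type error.

```python
def total_length(words):
    """Sum the lengths of the given strings."""
    return sum(len(w) for w in words)

ok = total_length(["ab", "cde"])
print(ok)  # 5

try:
    total_length(["ab", 5])    # an int has no length
    caught = False
except TypeError:
    caught = True

print(caught)  # True
```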
Intended Use: System, General, Specific, Script
System languages are used for system programming, such as developing the core parts of an operating system.
General-purpose languages are used to write software that will be used for a wide variety of applications.
Specific-purpose languages are for a specific domain, like SQL for databases or HTML for web pages.
Scripting languages are used for small tasks like text processing or network scripting.
Standard library and run-time system
- Most languages come with a set of built-in functions, classes, and modules that form the standard library, providing utilities for common tasks such as math, text handling, and file access.
- The run-time system is the software/hardware environment within which a program runs.
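As a brief illustration, the Python sketch below handles a few common tasks using only standard library modules, with no third-party packages. The sample sentence and numbers are invented.

```python
import math
import statistics
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

most_common = Counter(words).most_common(1)   # most frequent word and its count
root = math.sqrt(16)
mean = statistics.mean([1, 2, 3, 4])

print(most_common)  # [('the', 3)]
print(root)         # 4.0
print(mean)         # 2.5
```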