The notion of “FPGA programming” may be a little misleading. Unlike a CPU, an FPGA has no program to run. FPGA programming consists of designing a logic circuit that performs the requested algorithm and describing it using a hardware description language. Consequently, the building blocks of this algorithm are not variables, conditions and a set of operations to be performed, but rather logic gates, adders, registers and multiplexers. The described circuit is eventually compiled into logic modules - the building blocks of FPGAs.
Building blocks of a logic circuit
All computers are electronic machines that perform operations using boolean logic. In boolean logic, there are only two values a variable can be assigned: true and false. FPGA logic modules are designed to perform arbitrary boolean operations with a specific number of inputs and outputs.
Combinational logic implements arbitrary truth tables and basic operations on multi-bit boolean variables. Multiplexers select one of the input paths depending on a condition. Ripple-carry adders implement fast additions and subtractions on arbitrary-length variables without engaging combinational logic. Finally, registers are utilized to store boolean values.
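As a rough illustration, the four kinds of building blocks could be expressed in VHDL (the language used in the rest of this article) as in the sketch below. The entity name building_blocks and all port and signal names are hypothetical and chosen only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity, for illustration only.
entity building_blocks is
    port (
        clk  : in  std_logic;
        sel  : in  std_logic;
        a, b : in  std_logic_vector(3 downto 0);
        q    : out std_logic_vector(3 downto 0)
    );
end building_blocks;

architecture sketch of building_blocks is
    signal and_ab  : std_logic_vector(3 downto 0);
    signal sum_ab  : std_logic_vector(3 downto 0);
    signal mux_out : std_logic_vector(3 downto 0);
begin
    -- Combinational logic: a bitwise AND of two 4-bit values.
    and_ab <= a and b;

    -- Adder: 4-bit addition, the carry out is discarded.
    sum_ab <= std_logic_vector(unsigned(a) + unsigned(b));

    -- Multiplexer: selects one of the two paths depending on sel.
    mux_out <= sum_ab when sel = '1' else and_ab;

    -- Register: stores the selected value on each rising clock edge.
    process(clk)
    begin
        if rising_edge(clk) then
            q <= mux_out;
        end if;
    end process;
end sketch;
```

How these statements are mapped onto logic modules depends on the target device and the synthesis tool.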
Languages used in FPGA programming
A hardware description language (HDL) is used to assemble these FPGA building blocks into a circuit that performs a specific task, which makes the work different from programming in typical high-level languages. The two most popular hardware description languages are VHDL and Verilog. VHDL’s syntax is similar to Pascal. Verilog, however, is similar to C.
Writing a counter with VHDL
In order to demonstrate how to use a hardware description language, we will write a simple but omnipresent hardware module - a counter. Our language of choice will be VHDL. Despite being a complex and somewhat archaic language with certain pitfalls, simple hardware designs in VHDL are self-explanatory.
The VHDL file begins with the keywords library and use. First, we declare the library and the packages that the code below relies on. Their counterpart in the C language is the include section.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
Package std_logic_1164 contains the definitions of std_logic and std_logic_vector. These two types model the logic states of single-bit and multi-bit signals, respectively. Apart from zeros and ones, these types also contain special values that represent logical uncertainty and uninitialised values, which are used in hardware design simulation. Package numeric_std contains the signed and unsigned data types and conversion functions from and to the std_logic types. In our example, apart from std_logic and std_logic_vector, the type unsigned will be used.
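The behaviour of these types can be tried out in a small simulation-only sketch such as the one below. The entity name types_demo and the signal names are hypothetical; the assertions rely on the fact that a std_logic signal without an initial value starts as 'U'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical, simulation-only entity for illustration.
entity types_demo is
end types_demo;

architecture sim of types_demo is
    signal bit_s : std_logic;                               -- no initial value: starts as 'U'
    signal vec_s : std_logic_vector(3 downto 0) := "1010";  -- decimal 10
    signal num_s : unsigned(3 downto 0);
begin
    -- std_logic_vector has no arithmetic; convert to unsigned first.
    num_s <= unsigned(vec_s) + 1;

    process
    begin
        -- An uninitialised std_logic signal holds the special value 'U'.
        assert bit_s = 'U' report "expected 'U'" severity error;
        wait for 1 ns;
        -- 10 + 1 = 11; to_integer converts unsigned to integer.
        assert to_integer(num_s) = 11 report "expected 11" severity error;
        -- Converting back to std_logic_vector is a plain type conversion.
        assert std_logic_vector(num_s) = "1011" report "expected 1011" severity error;
        wait;
    end process;
end sim;
```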
entity universal_counter is
    generic (
        BIT_WIDTH : integer := 8
    );
    port (
        clk    : in  std_logic;
        rst    : in  std_logic;
        load   : in  std_logic;
        enable : in  std_logic;
        input  : in  std_logic_vector((BIT_WIDTH-1) downto 0);
        output : out std_logic_vector((BIT_WIDTH-1) downto 0)
    );
end universal_counter;
The second part of the code is an entity declaration. We declare universal_counter as a design element with generics and ports. Generics in VHDL enable us to synthesize the same element in different variants. In this case, multiple counters may have different widths and, when no generic is passed, the compiler assumes the default value of 8. Ports in VHDL define the inputs and outputs of the entity. In the case of universal_counter, we have five input ports and one output port: the clock clk, the reset rst, load, which copies the value from input into the counter, enable, which makes the counter count, input with the value to be loaded, and finally output with the current value of the counter.
architecture rtl of universal_counter is
    signal value_r, value_next : unsigned((BIT_WIDTH-1) downto 0);
begin
The circuit design of an entity is specified in the architecture section. A single entity may have multiple architectures in VHDL. This stems from the fact that different targets require different design approaches. For example, the architecture of an FPGA entity must exploit all features of the target FPGA device, such as logic modules, memories and DSP blocks (if available). On the other hand, the architecture of ASIC circuits must be self-contained and must avoid resource-heavy components such as memories and multipliers.
In our case, we’re specifying a register transfer level architecture. Register transfer level means utilizing both sequential elements (such as registers) and combinational elements (such as logic) to map the circuit. This architecture contains two signals of the unsigned type. Signals model wires to which a value can be assigned within the architecture. In our example, the signals are used to implement a register built from flip-flops. Signal value_r stores the current value of the output, and signal value_next carries the next value, which is derived using combinational logic.
All statements defined inside the architecture run in parallel. The statement part of the architecture starts with the keyword begin.
    process(clk, rst)
    begin
        if rst = '1' then
            value_r <= (others => '0');
        elsif rising_edge(clk) then
            value_r <= value_next;
        end if;
    end process;
A process is a special, event-driven statement in VHDL. Statements inside a process are executed sequentially every time a signal on the sensitivity list (in our case clk and rst) changes. Furthermore, VHDL processes may declare their own variables to be used inside the process. In our case, we assign zero to signal value_r whenever reset is asserted, regardless of the clock, and otherwise assign the derived value value_next on each rising clock edge.
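The counter itself needs no process variables, so here is a separate minimal sketch of how one might be used. The entity parity_calc and its ports data and odd are hypothetical and not part of the counter design; the variable acc is updated immediately with :=, unlike a signal, which is updated only after the process suspends.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: computes the XOR of all bits of data.
entity parity_calc is
    generic (
        BIT_WIDTH : integer := 8
    );
    port (
        data : in  std_logic_vector((BIT_WIDTH-1) downto 0);
        odd  : out std_logic  -- '1' when data contains an odd number of ones
    );
end parity_calc;

architecture rtl of parity_calc is
begin
    process(data)
        -- A variable is local to the process and changes immediately.
        variable acc : std_logic;
    begin
        acc := '0';
        for i in data'range loop
            acc := acc xor data(i);
        end loop;
        -- The signal assignment takes effect when the process suspends.
        odd <= acc;
    end process;
end rtl;
```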
    value_next <= unsigned(input) when load = '1' else
                  value_r + 1     when enable = '1' else
                  value_r;
This conditional signal assignment derives value_next from signals known inside the architecture. Since value_r and value_next are of the unsigned type, input must be converted explicitly before being used. The conversion costs no hardware; the compiler implements the whole statement using combinational logic: multiplexers and an adder.
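The same priority logic can also be written as a combinational process with an if statement, which some designers find easier to read. The sketch below isolates it in a hypothetical entity next_value_logic so that it compiles on its own; in the counter, the process would simply replace the conditional assignment. Note that the sensitivity list names every signal the process reads.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity wrapping only the next-value logic of the counter.
entity next_value_logic is
    generic (
        BIT_WIDTH : integer := 8
    );
    port (
        load       : in  std_logic;
        enable     : in  std_logic;
        input      : in  std_logic_vector((BIT_WIDTH-1) downto 0);
        value_r    : in  unsigned((BIT_WIDTH-1) downto 0);
        value_next : out unsigned((BIT_WIDTH-1) downto 0)
    );
end next_value_logic;

architecture rtl of next_value_logic is
begin
    process(load, enable, input, value_r)
    begin
        if load = '1' then
            value_next <= unsigned(input);  -- load has the highest priority
        elsif enable = '1' then
            value_next <= value_r + 1;      -- count up, wrapping at the maximum
        else
            value_next <= value_r;          -- hold the current value
        end if;
    end process;
end rtl;
```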
    output <= std_logic_vector(value_r);
end rtl;
The last statement assigns the value from the flip-flop to the output. VHDL architecture is finalized with the keyword end.
Simulating a counter with VHDL
To simulate a counter we must define a stimulus. A stimulus is a virtual entity that generates input signals to a design. More complex stimuli will not only generate input signals but also compare output from the hardware design with the output generated by the reference model to ensure that the design works as expected.
As with the universal_counter entity, the stimulus source file begins with library and entity declarations.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity tb_counter is
end entity;
Because the stimulus neither receives data from nor provides data to the outside, its entity section can be empty. All of the work happens inside the architecture section, which is therefore more complex than the previous one.
architecture tb of tb_counter is
    component universal_counter is
        generic (
            BIT_WIDTH : integer := 8
        );
        port (
            clk    : in  std_logic;
            rst    : in  std_logic;
            load   : in  std_logic;
            enable : in  std_logic;
            input  : in  std_logic_vector((BIT_WIDTH-1) downto 0);
            output : out std_logic_vector((BIT_WIDTH-1) downto 0)
        );
    end component;
    signal s_clk    : std_logic;
    signal s_rst    : std_logic;
    signal s_load   : std_logic;
    signal s_enable : std_logic;
    signal s_input  : std_logic_vector(3 downto 0);
    signal s_output : std_logic_vector(3 downto 0);
begin
The declarative part of the architecture contains a component declaration that mirrors the entity of universal_counter. When the design is elaborated, the component is bound to the previously defined entity of the same name. Furthermore, we need signals to communicate with the component. Below, we instantiate a 4-bit counter. Thus, both the s_input and s_output signals have a width of 4 bits.
    DUT: universal_counter generic map (
        4
    ) port map (
        clk    => s_clk,
        rst    => s_rst,
        load   => s_load,
        enable => s_enable,
        input  => s_input,
        output => s_output
    );
The generic map and port map connect the generics and ports of the component with values and signals defined in the architecture. Since there is only one generic, we pass the width of 4 by position and don’t need to specify the name of the generic. Ports, in contrast, are connected by name here, so each port is explicitly paired with the matching signal. The DUT (device under test) label at the beginning of the statement is the name of this component instance. It allows us to access that specific instance in the simulation.
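Generics can also be associated by name, which is more readable once an entity has several of them. The sketch below additionally uses direct entity instantiation, available since VHDL-93, which makes the component declaration unnecessary. The wrapper entity counter16_wrapper is hypothetical, and the code assumes that universal_counter has been compiled into the library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical wrapper that fixes the counter width to 16 bits.
entity counter16_wrapper is
    port (
        clk    : in  std_logic;
        rst    : in  std_logic;
        load   : in  std_logic;
        enable : in  std_logic;
        input  : in  std_logic_vector(15 downto 0);
        output : out std_logic_vector(15 downto 0)
    );
end counter16_wrapper;

architecture rtl of counter16_wrapper is
begin
    -- Direct instantiation: no component declaration is needed.
    U_CNT: entity work.universal_counter
        generic map (
            BIT_WIDTH => 16  -- named association of the generic
        )
        port map (
            clk    => clk,
            rst    => rst,
            load   => load,
            enable => enable,
            input  => input,
            output => output
        );
end rtl;
```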
    CLK_GEN: process(s_clk)
    begin
        if s_clk = 'U' then
            s_clk <= '1';
        else
            s_clk <= not s_clk after 5 ns;
        end if;
    end process;
To generate the clock, we exploit the event-driven nature of the process statement. At the beginning of the simulation, every signal without an explicit initial value is uninitialized (value U in std_logic) and every process is executed once. On that first run, CLK_GEN sets s_clk to 1. The resulting event on s_clk triggers the process again, and from then on each change of s_clk schedules its negation after half of the clock cycle (we assume a cycle of 10 nanoseconds).
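A common alternative, shown here only as a sketch, gives the clock signal an initial value and toggles it with a single concurrent assignment, so the check for U is not needed. The entity clk_gen_demo, the constant CLK_PERIOD and the flag done are hypothetical names; the flag stops the clock so that the simulation can finish on its own. With an initial value of 0 the first rising edge occurs after half a period, so the waveform is shifted relative to the CLK_GEN process above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, simulation-only entity for illustration.
entity clk_gen_demo is
end clk_gen_demo;

architecture sim of clk_gen_demo is
    constant CLK_PERIOD : time := 10 ns;
    signal s_clk : std_logic := '0';   -- explicit initial value instead of 'U'
    signal done  : boolean   := false; -- set to true to stop the clock
begin
    -- Toggle the clock every half period until done is set.
    s_clk <= not s_clk after CLK_PERIOD / 2 when not done else '0';

    -- Let the clock run for five periods, then stop it.
    process
    begin
        wait for 5 * CLK_PERIOD;
        done <= true;
        wait;
    end process;
end sim;
```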
    STIMULUS: process
    begin
        s_rst    <= '1';
        s_load   <= '0';
        s_enable <= '0';
        s_input  <= (others => '1');
        wait for 10 ns;
        s_rst    <= '0';
        s_load   <= '1';
        wait for 10 ns;
        s_load   <= '0';
        s_enable <= '1';
        wait;
    end process;
end;
When a process statement is used without a sensitivity list, its statements are run sequentially and execution is suspended at each wait statement; the final wait without arguments suspends the process forever, so the sequence is run only once. First, we assert the reset signal to initialize the component. Furthermore, we explicitly set all values of the input signals. In the next clock cycle, we deassert the reset signal and load the counter with the input value, which was previously set to the maximum. Lastly, we assert the enable signal. Using an HDL simulation tool, we can observe the behavior of our entity. The plotted signal values are shown below.
Fig. 1 Plotted signal values
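Instead of inspecting the waveform by eye, the stimulus can check the output itself with assert statements. The following is a minimal sketch of such a self-checking variant, not a full reference-model comparison. The entity tb_counter_check and the CHECK process are hypothetical additions, the clock and stimulus are copied from the testbench above, and universal_counter is assumed to be compiled into the library work. The sampling times assume rising clock edges at 10, 20, 30 and 40 ns, which is what the CLK_GEN process produces.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical self-checking variant of tb_counter.
entity tb_counter_check is
end entity;

architecture tb of tb_counter_check is
    signal s_clk    : std_logic;
    signal s_rst    : std_logic;
    signal s_load   : std_logic;
    signal s_enable : std_logic;
    signal s_input  : std_logic_vector(3 downto 0);
    signal s_output : std_logic_vector(3 downto 0);
begin
    DUT: entity work.universal_counter
        generic map (BIT_WIDTH => 4)
        port map (
            clk => s_clk, rst => s_rst, load => s_load,
            enable => s_enable, input => s_input, output => s_output
        );

    -- Same clock generator as in the article: rising edges at 10, 20, 30 ns...
    CLK_GEN: process(s_clk)
    begin
        if s_clk = 'U' then
            s_clk <= '1';
        else
            s_clk <= not s_clk after 5 ns;
        end if;
    end process;

    -- Same stimulus as in the article.
    STIMULUS: process
    begin
        s_rst    <= '1';
        s_load   <= '0';
        s_enable <= '0';
        s_input  <= (others => '1');
        wait for 10 ns;
        s_rst    <= '0';
        s_load   <= '1';
        wait for 10 ns;
        s_load   <= '0';
        s_enable <= '1';
        wait;
    end process;

    -- Sample the output in the middle of each clock cycle.
    CHECK: process
    begin
        wait for 25 ns;  -- after the edge at 20 ns: input has been loaded
        assert s_output = "1111" report "load failed" severity error;
        wait for 10 ns;  -- after the edge at 30 ns: counter wrapped around
        assert s_output = "0000" report "wrap-around failed" severity error;
        wait for 10 ns;  -- after the edge at 40 ns: counter incremented
        assert s_output = "0001" report "increment failed" severity error;
        report "checks finished" severity note;
        wait;
    end process;
end;
```

Because the clock never stops, the simulation should be run for a fixed time, for example 50 ns.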
Synthesizing a counter
Synthesis is a process that maps the described hardware onto the logic modules described in the first section. To synthesize the presented counter, we just need to specify the target device and, if we’re about to use a specific board, the desired pinout. No other entities are required.
The netlist, a schematic that shows how the FPGA compiler interpreted the circuit written in the hardware description language, is depicted below. As we can see, the design requires a multi-bit adder, two multiplexers and a register.
Fig. 2 A netlist scheme
The technology map viewer enables us to see how the netlist was eventually mapped onto the specific hardware components of a target device. Nevertheless, the technology map for even very simple circuits such as a multi-bit counter can be very complicated. Thus, it won’t be depicted.
Summary
In this article, the building blocks of an FPGA device, a simple circuit design, and its stimulus were explained.
Finally, the code of the counter looks as follows:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity universal_counter is
    generic (
        BIT_WIDTH : integer := 8
    );
    port (
        clk    : in  std_logic;
        rst    : in  std_logic;
        load   : in  std_logic;
        enable : in  std_logic;
        input  : in  std_logic_vector((BIT_WIDTH-1) downto 0);
        output : out std_logic_vector((BIT_WIDTH-1) downto 0)
    );
end universal_counter;
architecture rtl of universal_counter is
    signal value_r, value_next : unsigned((BIT_WIDTH-1) downto 0);
begin
    process(clk, rst)
    begin
        if rst = '1' then
            value_r <= (others => '0');
        elsif rising_edge(clk) then
            value_r <= value_next;
        end if;
    end process;
    value_next <= unsigned(input) when load = '1' else
                  value_r + 1     when enable = '1' else
                  value_r;
    output <= std_logic_vector(value_r);
end rtl;
The code of the stimulus is presented below.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity tb_counter is
end entity;
architecture tb of tb_counter is
    component universal_counter is
        generic (
            BIT_WIDTH : integer := 8
        );
        port (
            clk    : in  std_logic;
            rst    : in  std_logic;
            load   : in  std_logic;
            enable : in  std_logic;
            input  : in  std_logic_vector((BIT_WIDTH-1) downto 0);
            output : out std_logic_vector((BIT_WIDTH-1) downto 0)
        );
    end component;
    signal s_clk    : std_logic;
    signal s_rst    : std_logic;
    signal s_load   : std_logic;
    signal s_enable : std_logic;
    signal s_input  : std_logic_vector(3 downto 0);
    signal s_output : std_logic_vector(3 downto 0);
begin
    DUT: universal_counter generic map (
        4
    ) port map (
        clk    => s_clk,
        rst    => s_rst,
        load   => s_load,
        enable => s_enable,
        input  => s_input,
        output => s_output
    );
    CLK_GEN: process(s_clk)
    begin
        if s_clk = 'U' then
            s_clk <= '1';
        else
            s_clk <= not s_clk after 5 ns;
        end if;
    end process;
    STIMULUS: process
    begin
        s_rst    <= '1';
        s_load   <= '0';
        s_enable <= '0';
        s_input  <= (others => '1');
        wait for 10 ns;
        s_rst    <= '0';
        s_load   <= '1';
        wait for 10 ns;
        s_load   <= '0';
        s_enable <= '1';
        wait;
    end process;
end;