Validations & Transformations.
A first-principles walkthrough of the rules and guidelines you keep in mind while designing APIs — the gate every piece of client data passes through before any business logic runs. Why it exists, where it lives, how it works underneath, with complete implementations in Go and Python shown side by side.
What They Are & Why They Exist
Validations and transformations are not a single feature you install. They are a set of rules and guidelines you keep in mind while designing your APIs. Almost everything about them traces back to two goals, and those two goals are the lens through which every later decision is made:
What counts as "incoming client data"
The validation boundary covers every kind of data a client can push at the server, not just the request body. All of the following must be treated as untrusted until proven otherwise:
- JSON payloads: the body of a POST/PUT/PATCH request.
- Query parameters: e.g. ?page=2&limit=20.
- Path parameters: e.g. the 5 in /users/5.
- Headers: anything the client sets on the request.
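To make the four sources concrete, here is a minimal sketch that gathers them from one request into a single dictionary that is still entirely untrusted. The function name collect_untrusted and the USER_ROUTE pattern are hypothetical, chosen only to mirror the /users/5 and ?page=2&limit=20 examples above.

```python
import json
import re
from urllib.parse import parse_qs, urlsplit

# Hypothetical route pattern for illustration: /users/<id>
USER_ROUTE = re.compile(r"^/users/(?P<id>[^/]+)$")


def collect_untrusted(url: str, headers: dict, body: bytes) -> dict:
    """Gather every client-controlled input in one place.

    Nothing is validated here; every value is still raw and untrusted.
    """
    parts = urlsplit(url)
    match = USER_ROUTE.match(parts.path)
    try:
        payload = json.loads(body) if body else None
    except ValueError:
        payload = None  # malformed JSON is itself a validation failure
    return {
        "body": payload,
        # parse_qs yields lists of strings: query values are always text
        "query": {k: v[0] for k, v in parse_qs(parts.query).items()},
        "path": match.groupdict() if match else {},
        "headers": {k.lower(): v for k, v in headers.items()},
    }


raw = collect_untrusted(
    "/users/5?page=2&limit=20",
    {"Content-Type": "application/json"},
    b'{"name": 0}',
)
print(raw)
```

Note that the path and query values come out as strings, which is why casting shows up later as a transformation step.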
Where It Lives: The Three Layers
To pinpoint where validation happens you first need the standard mental model of a backend: three stacked layers (controller, service, and repository), each with one job. A request travels down the stack and the response travels back up.
The reason controller and service are deliberately separated is to keep HTTP concerns (error codes, success codes, response shape, validation) in one place, and pure business logic in another. The controller calls the service; the service may or may not call the repository, depending on what the request needs.
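The separation described above can be sketched in a few lines. This is an illustrative layout, not a prescribed one: the class names BookRepository, BookService, and BookController are assumptions, and an in-memory list stands in for the database.

```python
class BookRepository:
    """Repository: the only layer that talks to storage (a list here)."""

    def __init__(self):
        self._rows = []

    def insert(self, name: str) -> dict:
        row = {"id": len(self._rows) + 1, "name": name}
        self._rows.append(row)
        return row


class BookService:
    """Service: pure business logic, no HTTP concepts."""

    def __init__(self, repo: BookRepository):
        self._repo = repo

    def create_book(self, name: str) -> dict:
        return self._repo.insert(name)


class BookController:
    """Controller: HTTP concerns (status codes, response shape, validation)."""

    def __init__(self, service: BookService):
        self._service = service

    def create(self, payload: dict) -> tuple[int, dict]:
        name = payload.get("name")
        if not isinstance(name, str):
            # Rejected here: the service and repository are never reached.
            return 400, {"errors": ["name: expected a string"]}
        return 201, self._service.create_book(name)


controller = BookController(BookService(BookRepository()))
print(controller.create({"name": 0}))
print(controller.create({"name": "The Hobbit"}))
```

Only the controller knows about status codes; the service and repository could be reused unchanged behind a different transport.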
The Request Lifecycle
A request does not magically appear inside your controller method. It arrives, gets route-matched, and only then is the controller method assigned to that route invoked. That ordering is the key to placing validation correctly.
- Data arrives. A client opens an HTTP connection and sends a request — typically a JSON payload, plus query and path parameters and headers.
- Route matching runs. The router compares the incoming path against its patterns and selects the controller method registered for the matched route.
- Validation & transformation pipeline fires. Before any significant logic runs, the matched controller passes the raw data through the pipeline.
- Business logic runs. Only if the data passed does the controller call the service method.
- Service → repository. The service executes logic, optionally calling repository methods that hit the database.
- Response travels back up. Whatever the service returns is handed to the controller, which returns it to the caller over the HTTP connection.
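The ordering in the list above (match the route first, then validate, then run logic) can be sketched with a toy router. The route decorator, the ROUTES table, and the dispatch function are hypothetical helpers written for this illustration; real frameworks provide their own equivalents.

```python
import re

ROUTES = []  # list of (method, compiled pattern, handler)


def route(method: str, pattern: str):
    """Register a handler for a method + path pattern (toy router)."""
    def register(handler):
        ROUTES.append((method, re.compile(f"^{pattern}$"), handler))
        return handler
    return register


@route("GET", r"/users/(?P<user_id>[^/]+)")
def get_user(params: dict) -> tuple[int, dict]:
    # Validation + transformation: the path parameter arrives as text.
    if not re.fullmatch(r"[0-9]+", params["user_id"]):
        return 400, {"errors": ["user_id: expected a non-negative integer"]}
    user_id = int(params["user_id"])
    # Business logic would run here (service, then repository).
    return 200, {"id": user_id}


def dispatch(method: str, path: str) -> tuple[int, dict]:
    """Route matching runs first; only then is the matched handler invoked."""
    for registered_method, pattern, handler in ROUTES:
        match = pattern.match(path)
        if match and registered_method == method:
            return handler(match.groupdict())
    return 404, {"error": "not found"}


print(dispatch("GET", "/users/5"))
print(dispatch("GET", "/users/abc"))
```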
The Execution Point
Validation and transformation happen at exactly one point: inside the controller layer, immediately after the route is matched and before any service method is called or any business logic runs. Concretely it is implemented as a reusable middleware or utility function that you hand a schema — the description of what the data should look like — and it walks every field of the incoming payload against that schema.
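One possible shape for such a reusable utility is a decorator that takes a schema and wraps the controller. The sketch below is an assumption-laden illustration: the schema format (field name mapped to an expected type and a required flag) and the names check and validated are invented here, and real libraries offer far richer schemas.

```python
from functools import wraps

# Hypothetical schema format: field name -> (expected type, required?)
CREATE_BOOK_SCHEMA = {"name": (str, True)}


def check(payload: dict, schema: dict) -> list[str]:
    """Walk every schema field against the payload and collect messages."""
    errors = []
    for field, (expected_type, required) in schema.items():
        if field not in payload:
            if required:
                errors.append(f"{field}: is required")
            continue
        if not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    return errors


def validated(schema: dict):
    """Wrap a controller so the schema check runs before its body."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(payload):
            if not isinstance(payload, dict):
                return 400, {"errors": ["body: expected a JSON object"]}
            errors = check(payload, schema)
            if errors:
                return 400, {"errors": errors}
            return handler(payload)
        return wrapper
    return decorator


@validated(CREATE_BOOK_SCHEMA)
def create_book(payload: dict):
    # Runs only when the payload satisfied the schema.
    return 201, {"id": 1, "name": payload["name"]}


print(create_book({"name": 0}))
print(create_book({"name": "The Hobbit"}))
```

Keeping the schema as data means the same wrapper can guard every endpoint; only the schema changes.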
Why It Is Critical — The 500 vs 400 Story
The clearest way to feel the need for validation is to watch what happens without it. Imagine a "create book" API that expects a JSON body with a name field that should be a string. A client instead sends the number 0:
{ "name": 0 }
In your Postgres schema the column was almost certainly declared like this, a string column that cannot be null:
CREATE TABLE books (
    id   SERIAL PRIMARY KEY,
    name TEXT NOT NULL  -- text column, NULL not allowed
);
With no validation, that 0 sails through the controller, through the service, down into the repository, and becomes an INSERT. What happens next depends on the driver and on how the query is built: the value may be rejected as a type mismatch, or Postgres may quietly cast it and store the text '0', which is arguably worse. In the failing case the database call errors out, and the client receives:
{ "error": "Internal Server Error" }
That is a poor user experience. A 500 says "something unexpected broke inside us" when in reality the client sent bad input, and it gives the client nothing to fix. With a validation pipeline at the entry point, the server catches the mistake instantly, the database is never touched, and the client gets an actionable 400 Bad Request:
{ "errors": [ "name: expected a string, received a number" ] }
Inside the Pipeline: Step by Step
A robust pipeline processes a field in layers, and the order matters: each check only runs if the previous one passed. Take a single field, name, required to be a string of length 5–100:
- Existence check. Does name exist in the payload at all? If not → "name is required", stop.
- Type check. It exists, but is it a string, or did the client send an array / boolean / number? If wrong type → "expected a string", stop.
- Constraint check. It is a string, but does it satisfy the restrictions? Length 5–100. A 2-character string or a whole paragraph both fail with a tailored message.
In code, libraries express this as a declarative schema. Go commonly uses struct tags with go-playground/validator; Python uses Pydantic, where the same model both validates and transforms. Here is a complete, runnable version of the pipeline in each:
package books

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"strings"

	"github.com/go-playground/validator/v10"
)

// One shared, reusable validator (the pipeline engine).
var validate = validator.New()

// CreateBook is the SCHEMA. The tags declare every rule:
//   required      -> existence check
//   the Go type   -> type check (string)
//   min=5,max=100 -> constraint check
type CreateBook struct {
	Name string `json:"name" validate:"required,min=5,max=100"`
}

// runPipeline decodes then validates; returns 400-ready messages.
func runPipeline(r *http.Request) (*CreateBook, []string) {
	var body CreateBook
	// 1. TYPE CHECK during decode: a JSON number for `name`
	//    cannot unmarshal into a Go string -> caught here.
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		var typeErr *json.UnmarshalTypeError
		if errors.As(err, &typeErr) && typeErr.Field != "" {
			return nil, []string{typeErr.Field + ": expected a string"}
		}
		// Anything else (syntax error, non-object body) is a malformed request.
		return nil, []string{"body: malformed or mistyped JSON"}
	}
	// 2. EXISTENCE + CONSTRAINT checks in one pass.
	if err := validate.Struct(body); err != nil {
		return nil, formatErrors(err)
	}
	return &body, nil
}

// formatErrors turns validator output into human messages.
func formatErrors(err error) []string {
	var verrs validator.ValidationErrors
	if !errors.As(err, &verrs) {
		// Not a field-level failure (e.g. an invalid argument to Struct).
		return []string{"invalid input"}
	}
	var msgs []string
	for _, e := range verrs {
		f := strings.ToLower(e.Field())
		switch e.Tag() {
		case "required":
			msgs = append(msgs, fmt.Sprintf("%s: is required", f))
		case "min":
			msgs = append(msgs, fmt.Sprintf("%s: min %s chars", f, e.Param()))
		case "max":
			msgs = append(msgs, fmt.Sprintf("%s: max %s chars", f, e.Param()))
		default:
			msgs = append(msgs, fmt.Sprintf("%s: failed %s", f, e.Tag()))
		}
	}
	return msgs
}
from fastapi import HTTPException
from pydantic import BaseModel, Field, ValidationError


# CreateBook is the SCHEMA. Each annotation = one rule:
#   no default     -> existence check (required)
#   : str          -> type check
#   min/max_length -> constraint check
class CreateBook(BaseModel):
    name: str = Field(min_length=5, max_length=100)


def run_pipeline(raw: dict) -> CreateBook:
    """Validate in one call. Raises a 400-ready error on failure."""
    try:
        # constructing the model runs all three layers
        # in order: existence -> type -> constraint
        return CreateBook(**raw)
    except ValidationError as e:
        # reshape pydantic errors into clean strings
        messages = [f"{err['loc'][0]}: {err['msg']}" for err in e.errors()]
        raise HTTPException(status_code=400, detail=messages)


# run_pipeline({})                     -> 400 ["name: Field required"]
# run_pipeline({"name": 0})            -> 400 ["name: Input should be a valid string"]
# run_pipeline({"name": "ab"})         -> 400 ["name: String should have at least 5 characters"]
# run_pipeline({"name": "Dune"})       -> 400 (4 characters, still under 5)
# run_pipeline({"name": "The Hobbit"}) -> OK
The kinds of validation covered next are not a fixed taxonomy you must memorize, and there can be more depending on requirements, but in practice four flavors cover almost everything you will meet while designing APIs: type, syntactic, semantic, and complex (dependent) validation. The real takeaway is a question you keep asking yourself: how strict and how specific do I want to be with this data?
Type Validation
The most basic flavor: does the field match the data type the API expects? Is it a string, a number, a boolean, an array, or a nested JSON object — and not something else? It can also apply recursively: an array field may require that every element is a string, so element 0 being a number is itself a type error.
{ "stringField":"x", "numberField":"x", "arrayField":"x", "boolField":"x" }
numberField: expected number, received string
arrayField: expected array, received string
boolField: expected boolean, received string
{ "arrayField": [1, 2] } → arrayField[0]: expected string, received number
{ "stringField":"something", "numberField":10, "arrayField":["one","two"], "boolField":false }
package validate

import (
	"encoding/json"
	"errors"
	"net/http"

	"github.com/go-playground/validator/v10"
)

var validate = validator.New()

// Pointers let us tell a legitimate zero value (0, false) apart from
// "missing": `required` would reject a plain float64 of 0 or bool of false.
type TypePayload struct {
	StringField string   `json:"stringField" validate:"required"`
	NumberField *float64 `json:"numberField" validate:"required"`
	// dive = recurse INTO the slice; each element required
	ArrayField []string `json:"arrayField" validate:"required,dive,required"`
	BoolField  *bool    `json:"boolField" validate:"required"`
}

func parseTypes(r *http.Request) (*TypePayload, error) {
	var p TypePayload
	dec := json.NewDecoder(r.Body)
	dec.DisallowUnknownFields() // reject stray keys
	// json.Decode enforces the BASE types: a string for
	// numberField / arrayField / boolField fails to unmarshal.
	if err := dec.Decode(&p); err != nil {
		return nil, errors.New("a field has the wrong data type")
	}
	// existence + recursive element check
	if err := validate.Struct(p); err != nil {
		return nil, err
	}
	return &p, nil
}

// {"numberField":"x"}  -> a field has the wrong data type
// {"arrayField":[1,2]} -> a field has the wrong data type
// (inspect *json.UnmarshalTypeError for the field name if a more
// specific message is needed)
from typing import List

from pydantic import BaseModel


class TypePayload(BaseModel):
    stringField: str
    numberField: float
    arrayField: List[str]  # recursive: EVERY element a string
    boolField: bool


# Explicit type errors:
#
# TypePayload(stringField="x", numberField="x",
#             arrayField="x", boolField="x")
#   -> numberField: Input should be a valid number
#   -> arrayField: Input should be a valid list
#   -> boolField: Input should be a valid boolean
#
# TypePayload(..., arrayField=[1, 2], ...)
#   -> arrayField.0: Input should be a valid string
#   -> arrayField.1: Input should be a valid string
#
# TypePayload(stringField="something", numberField=10,
#             arrayField=["one", "two"], boolField=False) -> OK
#
# Note: in Pydantic's default (lax) mode, coercible input such as
# numberField="10" is accepted and cast to 10.0; strict mode rejects it.
Syntactic Validation
Here the type is already correct (a string) but the string must follow a specific structural pattern. The validation algorithm checks the shape of the value:
- Email: a local part, then the @ character, then a domain made of a name plus a suffix (.com, .co.in, .org, …). There are more than a thousand valid TLDs; the structure is what gets checked.
- Phone number: a country code followed by the right number of digits for that country (e.g. a country code then a 10-digit number).
- Date: must match the expected structure, e.g. YYYY-MM-DD (year, then month, then day).
{ }
email: required
phone: required
date: required
{ "email":"randomstring", "phone":1234567, "date":"2025-11-05" }
email: invalid email format
phone: expected a string, received a number
{ "email":"test@test.com", "phone":"1234567", "date":"2025-11-05" } → 200
package validate

import (
	"errors"
	"regexp"
)

// Optional leading +, then 7-15 digits (country code + national number).
var phoneRe = regexp.MustCompile(`^\+?[0-9]{7,15}$`)

type Contact struct {
	Email string `validate:"required,email"`               // local @ domain.tld
	Phone string `validate:"required"`                     // pattern checked below
	Date  string `validate:"required,datetime=2006-01-02"` // YYYY-MM-DD
}

// checkSyntax handles the phone pattern (validator's
// built-ins cover email + date structure already).
func (c Contact) checkSyntax() error {
	if !phoneRe.MatchString(c.Phone) {
		return errors.New("phone: invalid phone number format")
	}
	return nil
}

// "randomstring"            -> email: invalid email format
// phone sent as JSON number -> received a number, want string
// "2025-13-40"              -> date: does not match YYYY-MM-DD
from datetime import date

from pydantic import BaseModel, EmailStr, Field


class Contact(BaseModel):
    email: EmailStr  # local @ domain.tld (needs the email-validator package)
    # optional leading +, then 7-15 digits
    phone: str = Field(pattern=r"^\+?[0-9]{7,15}$")
    date: date  # strings must be ISO YYYY-MM-DD


# Contact(email="randomstring", phone=1234567,
#         date="2025-11-05")
#   -> email: value is not a valid email address
#   -> phone: Input should be a valid string
#
# Contact(email="bad@", ...)
#   -> email: there must be something after the @-sign
#
# Contact(email="test@test.com", phone="1234567",
#         date="2025-11-05") -> OK
Semantic Validation
The type is right and the syntax is right, but does the value make sense in the real world? Semantic checks encode domain logic:
- A date of birth cannot be in the future: if today is 2025-01-11, a DOB of 2025-01-13 is nonsense even though it is a perfectly valid date.
- An age of 365 (or 430) does not make sense, at least not yet. A reasonable human age range is roughly 1 to 120.
{ "dateOfBirth":"2026-06-12", "age":43 }
dateOfBirth: date of birth cannot be in the future
{ "dateOfBirth":"1995-06-12", "age":430 }
age: number must be less than or equal to 120
{ "dateOfBirth":"1995-06-12", "age":43 } → 200
package validate

import (
	"errors"
	"time"
)

type Profile struct {
	DateOfBirth string `validate:"required,datetime=2006-01-02"`
	// gte/lte cover the "430 is impossible" semantic bound
	Age int `validate:"required,gte=1,lte=120"`
}

// Type & syntax can't express "not in the future":
// semantics need real logic against the real clock.
func (p Profile) checkSemantics() error {
	dob, err := time.Parse("2006-01-02", p.DateOfBirth)
	if err != nil {
		return errors.New("dateOfBirth: invalid date")
	}
	if dob.After(time.Now()) {
		return errors.New("dateOfBirth: date of birth cannot be in the future")
	}
	return nil
}

// {"dateOfBirth":"2026-06-12"} -> cannot be in the future
// {"age":430}                  -> age: must be 120 or less
from datetime import date

from pydantic import BaseModel, Field, field_validator


class Profile(BaseModel):
    dateOfBirth: date
    age: int = Field(ge=1, le=120)  # 430 -> must be <= 120

    @field_validator("dateOfBirth")
    @classmethod
    def not_in_future(cls, v: date) -> date:
        if v > date.today():
            raise ValueError("date of birth cannot be in the future")
        return v


# Profile(dateOfBirth="2026-06-12", age=43)
#   -> dateOfBirth: Value error, date of birth cannot be in the future
# Profile(dateOfBirth="1995-06-12", age=430)
#   -> age: Input should be less than or equal to 120
# Profile(dateOfBirth="1995-06-12", age=43) -> OK
Complex / Dependent Validation
The most powerful flavor: a field's rules depend on other fields. The pipeline can encode arbitrary cross-field logic:
- Password confirmation: passwordConfirmation must exactly match password, and password must be at least 8 characters. Two different strings yield "passwords don't match"; a 6-character password yields "string must contain at least 8 characters".
- Conditional requirement: partner is optional when married is false, but the moment married is true the partner name becomes required: "partner name is required when married is true".
{ "password":"random", "passwordConfirmation":"another", "married":false }
password: string must contain at least 8 characters
passwordConfirmation: passwords don't match
{ "password":"random12", "passwordConfirmation":"random12", "married":true }
partner: partner name is required when married is true
{ "password":"random12", "passwordConfirmation":"random12", "married":true, "partner":"Sam" } → 200
package validate

// Each struct tag must sit on a single line; a backslash inside a
// raw string does not continue it.
type Signup struct {
	Password string `json:"password" validate:"required,min=8"`
	// eqfield: must equal another field on the struct
	PasswordConf string `json:"passwordConfirmation" validate:"required,eqfield=Password"`
	Married      *bool  `json:"married" validate:"required"`
	// required_if: partner required only if married == true
	Partner string `json:"partner" validate:"required_if=Married true"`
}

// {password:"random", passwordConfirmation:"another",
//  married:false}
//   -> password: must be at least 8 characters
//   -> passwordConfirmation: passwords don't match
//
// {password:"random12", passwordConfirmation:"random12",
//  married:true}
//   -> partner: required when married is true
from pydantic import BaseModel, Field, model_validator


class Signup(BaseModel):
    password: str = Field(min_length=8)
    passwordConfirmation: str
    married: bool
    partner: str | None = None

    # mode="after": runs once all fields are parsed,
    # so it can compare them against each other.
    @model_validator(mode="after")
    def cross_field_rules(self):
        if self.password != self.passwordConfirmation:
            raise ValueError("passwords don't match")
        if self.married and not self.partner:
            raise ValueError("partner name is required when married is true")
        return self


# Signup(password="random", passwordConfirmation="another",
#        married=False)
#   -> password: String should have at least 8 characters
#   (the "after" validator is skipped when a field has already failed,
#   so the mismatch is reported only once the password itself is valid)
#
# Signup(password="random12", passwordConfirmation="another",
#        married=False)
#   -> Value error, passwords don't match
Transformation as Type Casting
Transformation means executing operations on the incoming data to convert it into a desirable format — either before validation (so it can pass) or after it (to ready it for the service layer). The textbook case is pagination query parameters.
Consider GET /bookmarks?page=2&limit=20 with these requirements:
- page: a number, greater than 0 and less than 500.
- limit: a number, greater than 0 and less than 10,000 (so at most 9,999 records are returned at a time).
page arrives as the string "2", not the number 2. Validation that demands a number fails on its very first check, even though the client did nothing wrong; strings are simply how query params work.
The fix is not to error. It is the server's responsibility to cast the string into a number first. "Casting" is forcing one data type to become another. Cast "2" → 2, then run the numeric validations. That casting step is the transformation.
package validate

import (
	"errors"
	"net/url"
	"strconv"
)

type Pagination struct {
	Page  int
	Limit int
}

// Query params are ALWAYS strings, so we must CAST
// (transform) before we can VALIDATE the numbers.
func parsePagination(q url.Values) (*Pagination, error) {
	// TRANSFORM: force the string "2" into the int 2
	page, err := strconv.Atoi(q.Get("page"))
	if err != nil {
		return nil, errors.New("page: must be a number")
	}
	limit, err := strconv.Atoi(q.Get("limit"))
	if err != nil {
		return nil, errors.New("limit: must be a number")
	}
	// VALIDATE the now-numeric values
	if page <= 0 || page >= 500 {
		return nil, errors.New("page: must be 1..499")
	}
	if limit <= 0 || limit >= 10000 {
		return nil, errors.New("limit: must be 1..9999")
	}
	return &Pagination{Page: page, Limit: limit}, nil
}
from pydantic import BaseModel, Field


class Pagination(BaseModel):
    # Pydantic AUTO-CASTS the incoming string "2" into
    # int 2 (transform), THEN enforces gt/lt (validate):
    # both steps in one schema, in the right order.
    page: int = Field(gt=0, lt=500)
    limit: int = Field(gt=0, lt=10_000)


# request: /bookmarks?page=2&limit=20
# (both values arrive as strings)
#
# Pagination(page="2", limit="20")
#   -> Pagination(page=2, limit=20)   # real ints
#
# Pagination(page="0", limit="20")
#   -> page: Input should be greater than 0
#
# Pagination(page="abc", limit="20")
#   -> page: Input should be a valid integer
Transformation as Normalization
Transformation also cleans up data after it validates, reshaping it into what the service layer prefers. The server quietly normalizes the payload before doing anything with it:
- An email sent with mixed case (Test@TEST.com) comes back stored as all-lowercase test@test.com.
- A phone string gets a + prefix injected before it.
- A date gets recomputed into the canonical format the backend expects.
{ "email":"Test@TEST.com", "phone":"1234567", "date":"2025-11-05" }
{ "email":"test@test.com", "phone":"+1234567", "date":"2025-11-05" }
package validate

import "strings"

// normalize runs AFTER validation passes and BEFORE
// the data is handed to the service layer.
func (c *Contact) normalize() {
	// lowercase + trim the email
	c.Email = strings.ToLower(strings.TrimSpace(c.Email))
	// inject the leading + if it is missing
	c.Phone = strings.TrimSpace(c.Phone)
	if !strings.HasPrefix(c.Phone, "+") {
		c.Phone = "+" + c.Phone
	}
}

// "Test@TEST.com" -> "test@test.com"
// "1234567"       -> "+1234567"
from pydantic import BaseModel, EmailStr, field_validator


class Contact(BaseModel):
    email: EmailStr
    phone: str

    # field_validators double as transformers: whatever
    # they return REPLACES the incoming value.
    @field_validator("email")
    @classmethod
    def lower_email(cls, v: str) -> str:
        return v.strip().lower()  # normalize case

    @field_validator("phone")
    @classmethod
    def add_plus(cls, v: str) -> str:
        v = v.strip()
        return v if v.startswith("+") else "+" + v


# Contact(email="Test@TEST.com", phone="1234567")
#   -> email="test@test.com", phone="+1234567"
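The date case from the list above is not covered by the email and phone snippets. One way to sketch it, assuming the API chooses to accept a small set of input formats and store ISO YYYY-MM-DD: the ACCEPTED_FORMATS tuple and the normalize_date name are assumptions made for this illustration, not something the article specifies.

```python
from datetime import datetime

# Assumed input formats, tried in order. An ambiguous input such as
# 05/11/2025 resolves to the first format that parses it (day first
# here), so the list should be a deliberate choice.
ACCEPTED_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(value: str) -> str:
    """Return the date as canonical YYYY-MM-DD, or raise ValueError."""
    text = value.strip()
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next accepted format
    raise ValueError("date: unrecognized date format")


print(normalize_date(" 05/11/2025 "))
```

Because strptime also rejects impossible dates, a value like 2025-13-40 fails here instead of reaching the service layer.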
One Combined Pipeline
In practice validations and transformations are paired into a single pipeline. The reason is locality: the entire "input data layer" — every requirement and every operation performed on incoming data before any business logic runs — lives in one place. You never have to hunt across files to discover what an endpoint expects or what it quietly does to the data. The pipeline can transform → validate, or validate → transform, in either order, as the requirements demand.
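A combined pipeline can be sketched as an ordered list of small steps, each either a transformation or a validation. Everything named below (PipelineError, the step functions, run, PIPELINE) is hypothetical and kept deliberately simple; the email check in particular is a loose stand-in for a real format validator.

```python
from typing import Callable

Step = Callable[[dict], dict]


class PipelineError(Exception):
    """Carries the messages that become a 400 response."""

    def __init__(self, messages: list[str]):
        super().__init__("; ".join(messages))
        self.messages = messages


def cast_page(data: dict) -> dict:
    """Transform: query values arrive as strings, so cast before validating."""
    try:
        return {**data, "page": int(data.get("page", ""))}
    except (TypeError, ValueError):
        raise PipelineError(["page: must be a number"])


def check_page(data: dict) -> dict:
    """Validate: enforce the numeric range from the pagination example."""
    if not 0 < data["page"] < 500:
        raise PipelineError(["page: must be 1..499"])
    return data


def check_email(data: dict) -> dict:
    """Validate: loose structural check, for illustration only."""
    email = data.get("email")
    if not isinstance(email, str) or "@" not in email.strip("@"):
        raise PipelineError(["email: invalid email format"])
    return data


def lower_email(data: dict) -> dict:
    """Transform: normalize after validation has passed."""
    return {**data, "email": data["email"].strip().lower()}


def run(data: dict, steps: list[Step]) -> dict:
    """Apply the steps in the order given; the first failure aborts the run."""
    for step in steps:
        data = step(data)
    return data


# One endpoint's whole input layer, readable in one place:
# page is transform -> validate, email is validate -> transform.
PIPELINE = [cast_page, check_page, check_email, lower_email]

print(run({"page": "2", "email": "Test@TEST.com"}, PIPELINE))
```

Reading the PIPELINE list top to bottom tells you both what the endpoint expects and what it does to the data, which is the locality argument made above.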
Frontend vs Backend Validation
A common and dangerous mistake is treating frontend validation as a replacement for backend validation. They are not interchangeable — they serve different purposes, and you need both for every API.
Why can't the backend trust the frontend? Because a server can have many different clients. A polished web app might validate beautifully — but the same API can also be hit directly through an API client like Postman or Insomnia, where there is no frontend acting as a proxy and therefore no frontend validation at all. If the backend ever depended on the frontend for security or integrity, the server would break the moment the client changed.
When the two are wired up correctly they complement each other: in a web form, an invalid email blocks the submit and shows an inline error — no API call is even made. Fix it to a real address and the submit fires a proper request; the payload is sent, the server validates again, and returns 200. The frontend delivered instant feedback; the backend delivered the guarantee.
Full Annotated Controller
Putting the whole picture together: a complete controller that route-matches, runs the combined pipeline (transform → validate → normalize) before any business logic, returns a clean 400 on failure, and only then calls the service layer — which calls the repository. Both ends shown in full.
package main

import (
	"context"
	"encoding/json"
	"log"
	"net/http"
	"strings"

	"github.com/go-playground/validator/v10"
)

var validate = validator.New()

// ---- schema (the gate) ----
type CreateBook struct {
	Name string `json:"name" validate:"required,min=5,max=100"`
}

type Book struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// ---- helper ----
func writeJSON(w http.ResponseWriter, status int, b any) {
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	json.NewEncoder(w).Encode(b)
}

// ---- CONTROLLER: owns HTTP + the validation gate ----
func createBookHandler(w http.ResponseWriter, r *http.Request) {
	// The default mux matches on path only, so check the method here.
	if r.Method != http.MethodPost {
		writeJSON(w, http.StatusMethodNotAllowed, map[string]string{"error": "method not allowed"})
		return
	}
	var body CreateBook
	// === GATE: runs before ANY business logic ===
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		writeJSON(w, 400, map[string]string{"error": "invalid JSON body or wrong field type"})
		return // 400: the DB is never touched
	}
	body.Name = strings.TrimSpace(body.Name) // transform
	if err := validate.Struct(body); err != nil {
		writeJSON(w, 400, map[string]string{"error": err.Error()})
		return // 400 Bad Request, not a confusing 500
	}
	// === only now: business logic (service → repo) ===
	book, err := createBook(r.Context(), body.Name)
	if err != nil {
		writeJSON(w, 500, map[string]string{"error": "could not create book"})
		return
	}
	writeJSON(w, 201, book)
}

// ---- service + repository (sketched) ----
func createBook(ctx context.Context, name string) (*Book, error) {
	// service logic … repository INSERT … then:
	return &Book{ID: 1, Name: name}, nil
}

func main() {
	http.HandleFunc("/api/books", createBookHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, Field, field_validator

app = FastAPI()


# ---- schema (the gate) ----
class CreateBook(BaseModel):
    name: str = Field(min_length=5, max_length=100)

    # mode="before": trim first, so the length rules see the cleaned value
    @field_validator("name", mode="before")
    @classmethod
    def trim(cls, v):
        return v.strip() if isinstance(v, str) else v  # transform / normalize


class Book(BaseModel):
    id: int
    name: str


# ---- service + repository (sketched) ----
def create_book(name: str) -> Book:
    # service logic … repository INSERT … then:
    return Book(id=1, name=name)


# ---- CONTROLLER ----
# FastAPI runs the CreateBook schema (the GATE) BEFORE
# this function body. A bad payload never enters the
# function: the client gets an automatic 422 by default
# (a RequestValidationError handler can remap it to 400),
# so the DB is never touched and no client mistake leaks
# out as a confusing 500.
@app.post("/api/books", status_code=201)
def create_book_endpoint(body: CreateBook) -> Book:
    # validation already passed, straight to logic
    try:
        return create_book(body.name)  # service → repo
    except Exception:
        raise HTTPException(status_code=500, detail="could not create book")


# run: uvicorn main:app --reload
In short: validations give the client a clean 400 instead of a confusing 500, come in four flavors (type, syntactic, semantic, complex), pair with transformations (casting and normalization) into one pipeline, and on the backend are mandatory for security and data integrity, no matter what the client does.