uTea.ch Blog
a place to learn things

String cleaning

In which we see how to make our input strings safer and shinier

All our users are honourable people and none of them would ever try to exploit our software by feeding it bad information. Unfortunately, however, there are others (not our users) who might try to find weaknesses in our code, so it is up to us to make this as hard as possible for the bad actors.

One of the most common methods of finding a way into our code is through the front door. We have to let users interact with our program, so we have to leave open ways of having them give us input, but if we're not careful this may result in our code receiving information it is unable to handle properly. A famous example of this is the SQL injection attack, and if you don't know the story of Bobby Tables, here's a link to the excellent XKCD comic Exploits of a Mom.

It is relatively simple to make a string safe; the hard part is remembering to do the cleaning whenever we take in information. If we expect ourselves (and other programmers working on our code) to remember to clean every input, we need to make cleaning as simple as possible. I think the easiest approach is to extend the String type. We're going to add a .cleaned property which will provide the cleaned version of a string.

 1  public extension String {
 2
 3      static var allowedCharacters: CharacterSet {
 4          var allowedCharacters = CharacterSet.alphanumerics
 5          allowedCharacters.formUnion(CharacterSet.whitespaces)
 6          allowedCharacters.formUnion(CharacterSet.punctuationCharacters)
 7          allowedCharacters.remove(charactersIn: #";\"#)
 8          return allowedCharacters
 9      }
10
11      var cleaned: String {
12          return self.trimmingCharacters(in: .whitespacesAndNewlines).addingPercentEncoding(withAllowedCharacters: String.allowedCharacters) ?? ""
13      }
14  }

We can neutralise possibly dangerous characters in our strings by using the .addingPercentEncoding function¹². To use this function we supply it with the set of characters which are allowed in the string; any character not in our allowed set is replaced with its percent escape code (a semicolon becomes %3B, for example), and that renders it ineffective in an attack. In this case we allow all the alphanumeric characters⁴, all the whitespace⁵ and punctuation⁶, but we remove the semicolon and backslash characters⁷. Before encoding, we also trim any leading and trailing whitespace and newlines¹².
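To see what the encoding actually does, here is a small sketch that repeats the extension (without the line numbers, and with import Foundation added so it runs on its own) and feeds it a few strings. The sample inputs are made up purely for illustration.

```swift
import Foundation

// Same allowed set as the extension above, repeated so the sketch is self-contained.
extension String {
    static var allowedCharacters: CharacterSet {
        var allowed = CharacterSet.alphanumerics
        allowed.formUnion(.whitespaces)
        allowed.formUnion(.punctuationCharacters)
        allowed.remove(charactersIn: #";\"#)   // take out semicolon and backslash
        return allowed
    }

    var cleaned: String {
        return self.trimmingCharacters(in: .whitespacesAndNewlines)
            .addingPercentEncoding(withAllowedCharacters: String.allowedCharacters) ?? ""
    }
}

// Hypothetical sample inputs, chosen only for illustration.
let samples = [
    "  Ada Lovelace  ",
    "Robert; DROP TABLE Students",
    #"C:\temp"#,
]

for sample in samples {
    print("\(sample) -> \(sample.cleaned)")
}
```

The first sample just loses its surrounding spaces, the second comes back as Robert%3B DROP TABLE Students, and the third as C:%5Ctemp. Characters that stay in the allowed set, such as the apostrophe, pass through unchanged.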

Using our new property is as simple as remembering to add .cleaned whenever we first access a string that has come from any source we do not completely control.

 1  struct OnBoard1View: View {
 2      @State var enteredName: String = ""
 3      @State var navigateTo: String?
 4
 5      var body: some View {
 6          ZStack {
 7              Color.background.edgesIgnoringSafeArea(.all)
 8              VStack(alignment: .center) {
 9                  Group {
10                      NavigationLink("", destination: OnBoard2View(name: enteredName.cleaned), tag: "OnBoard2View", selection: $navigateTo)
11                  }
12                  Text("Please enter your name")
13                  TextField("Your name", text: $enteredName)
14                      .lineLimit(1)
15                      .autocapitalization(.words)
16                  Button("Record name") {
17                      navigateTo = "OnBoard2View"
18                  }
19              }
20              .padding(20)
21          }
22      }
23  }

In this fragment of a SwiftUI page, you can see how .cleaned is used in practice. This screen is part of the onboarding process in ourCare. Its job is to collect the user's name and pass it on to the rest of the onboarding process so that it can be used to customise dialogs and create the user's database record.

We have the state variable enteredName² bound to a text field so that the user can enter their name. We can't reasonably control what they type, so we can't trust the value in enteredName until we've cleaned it. When we use the name in the NavigationLink¹⁰ we add .cleaned to enteredName to ensure that the value we pass into the rest of the program is safe.
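Because .cleaned escapes characters rather than deleting them, a cleaned name can contain codes such as %3B, which would look odd in a dialog. One possible way to handle that, sketched here, is to decode the value only at the moment it is shown, using Foundation's removingPercentEncoding. The displayName(from:) helper is hypothetical and not part of ourCare, and the sketch assumes the original input had no literal percent signs, since % is in our allowed set and is not escaped.

```swift
import Foundation

// `displayName(from:)` is a hypothetical helper, not part of the original code.
// It decodes percent escapes for on-screen use only; the stored value stays encoded.
func displayName(from cleanedName: String) -> String {
    // removingPercentEncoding returns nil for malformed escapes,
    // so fall back to the encoded text rather than showing nothing.
    return cleanedName.removingPercentEncoding ?? cleanedName
}

// A value as it might look after passing through .cleaned.
let stored = "Robert%3B DROP TABLE Students"
print(displayName(from: stored))
```

The value that goes into the database record stays in its cleaned form; only the text put on screen is decoded.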


1 Those little numbers aren't footnotes; they refer to the line numbers in the block of code before the text.
