R

Asked • 09/26/19

Why is `[` better than `subset`?

When I need to filter a data.frame, i.e., extract rows that meet certain conditions, I prefer to use the `subset` function:

    subset(airquality, Month == 8 & Temp > 90)

rather than the `[` function:

    airquality[airquality$Month == 8 & airquality$Temp > 90, ]

There are two main reasons for my preference:

1. I find the code reads better, from left to right. Even people who know nothing about R could tell what the `subset` statement above is doing.
2. Because columns can be referred to as variables in the `select` expression, I can save a few keystrokes. In my example above, I only had to type `airquality` once with `subset`, but three times with `[`.

So I was living happily, using `subset` everywhere because it is shorter and reads better, even advocating its beauty to my fellow R coders. But yesterday my world fell apart. While reading the `subset` documentation, I noticed this section:

> Warning
>
> This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like `[`, and in particular the non-standard evaluation of argument `subset` can have unanticipated consequences.

Could someone help clarify what the authors mean?

First, what do they mean by "*for use interactively*"? I know what an interactive session is, as opposed to a script run in BATCH mode, but I don't see what difference it should make.

Then, could you please explain "*the non-standard evaluation of argument subset*" and why it is dangerous, and maybe provide an example?
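To make the quoted warning concrete, here is a minimal sketch of two ways the non-standard evaluation in `subset` can surprise you once the call sits inside a function. The helper names `rows_in_month`, `rows_in_month_bracket`, and `filter_rows` are hypothetical, invented for this illustration; only `subset`, `[`, and the built-in `airquality` data come from the question.

```r
# At the prompt, the two forms from the question select the same rows.
with_subset  <- subset(airquality, Month == 8 & Temp > 90)
with_bracket <- airquality[airquality$Month == 8 & airquality$Temp > 90, ]

# Pitfall 1: a name collision (hypothetical helper).
# subset() looks names up in the data frame first, so BOTH sides of
# `Month == Month` resolve to the column and the argument is ignored.
rows_in_month <- function(df, Month) {
  subset(df, Month == Month)
}
nrow(rows_in_month(airquality, 8))          # every row, not just August

# The `[` version uses ordinary evaluation, so the argument is used.
rows_in_month_bracket <- function(df, month) {
  df[df$Month == month, ]
}
nrow(rows_in_month_bracket(airquality, 8))  # only the August rows

# Pitfall 2: forwarding a condition through a wrapper (hypothetical helper).
# subset() captures the symbol `condition`, not the expression the caller
# wrote, so `Temp > 90` ends up evaluated in the caller's environment,
# where no object called `Temp` exists.
filter_rows <- function(df, condition) {
  subset(df, condition)
}
result <- tryCatch(
  filter_rows(airquality, Temp > 90),
  error = function(e) e
)
conditionMessage(result)                    # an "object not found" error
```

Neither problem shows up when you type `subset(...)` directly at the prompt, which is one plausible reading of "intended for use interactively": the risk appears when the call is wrapped in code whose variable names and calling environment you do not fully control.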

1 Expert Answer

David B. answered • 02/20/21

